Overview

Dataset statistics

Number of variables: 41
Number of observations: 59400
Missing cells: 46743
Missing cells (%): 1.9%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 18.6 MiB
Average record size in memory: 328.0 B
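
The percentage and per-record figures follow from the raw counts. A minimal Python sketch of that arithmetic, assuming the missing-cell percentage is taken over all cells (variables × observations) and that 1 MiB is 2^20 bytes; the variable names are illustrative:

```python
# Figures copied from the dataset statistics.
n_variables = 41
n_observations = 59400
missing_cells = 46743
total_size_mib = 18.6

# Assumption: the percentage is taken over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 MiB = 2**20 bytes. The reported size is rounded, so the
# per-record figure only matches the report approximately.
avg_record_bytes = total_size_mib * 2**20 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: {avg_record_bytes:.0f} B")
```

Both results round to the reported values (1.9% and roughly 328 B).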

Variable types

Numeric: 10
DateTime: 1
Text: 7
Categorical: 21
Boolean: 2

Alerts

recorded_by has constant value "" [Constant]
public_meeting is highly imbalanced (56.3%) [Imbalance]
management_group is highly imbalanced (69.3%) [Imbalance]
water_quality is highly imbalanced (71.3%) [Imbalance]
quality_group is highly imbalanced (68.0%) [Imbalance]
funder has 3637 (6.1%) missing values [Missing]
installer has 3655 (6.2%) missing values [Missing]
public_meeting has 3334 (5.6%) missing values [Missing]
scheme_management has 3878 (6.5%) missing values [Missing]
scheme_name has 28810 (48.5%) missing values [Missing]
permit has 3056 (5.1%) missing values [Missing]
amount_tsh is highly skewed (γ1 = 57.80779995) [Skewed]
num_private is highly skewed (γ1 = 91.93374999) [Skewed]
id is uniformly distributed [Uniform]
id has unique values [Unique]
amount_tsh has 41639 (70.1%) zeros [Zeros]
gps_height has 20438 (34.4%) zeros [Zeros]
longitude has 1812 (3.1%) zeros [Zeros]
num_private has 58643 (98.7%) zeros [Zeros]
population has 21381 (36.0%) zeros [Zeros]
construction_year has 20709 (34.9%) zeros [Zeros]

Reproduction

Analysis started: 2024-02-05 09:12:22.486560
Analysis finished: 2024-02-05 09:12:45.688426
Duration: 23.2 seconds
Software version: ydata-profiling v4.6.4
Download configuration: config.json
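
The reported duration can be checked against the two timestamps. A small sketch using only the Python standard library; the variable names are illustrative:

```python
from datetime import datetime

# Timestamps copied from the reproduction details.
started = datetime.fromisoformat("2024-02-05 09:12:22.486560")
finished = datetime.fromisoformat("2024-02-05 09:12:45.688426")

# Subtracting two datetimes yields a timedelta.
duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.1f} seconds")
```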

Variables

id
Real number (ℝ)

UNIFORM  UNIQUE 

Distinct: 59400
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 37115.132
Minimum: 0
Maximum: 74247
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 464.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 3730.9
Q1: 18519.75
Median: 37061.5
Q3: 55656.5
95-th percentile: 70564.05
Maximum: 74247
Range: 74247
Interquartile range (IQR): 37136.75

Descriptive statistics

Standard deviation: 21453.128
Coefficient of variation (CV): 0.57801569
Kurtosis: -1.201515
Mean: 37115.132
Median Absolute Deviation (MAD): 18568.5
Skewness: 0.0026225303
Sum: 2.2046388 × 10⁹
Variance: 4.6023672 × 10⁸
Monotonicity: Not monotonic
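
Several of the statistics for id are derived from others. A short Python sketch of those relationships, using the rounded values printed in this report, so the results agree only up to rounding:

```python
import math

# Values copied from the statistics for `id`.
mean = 37115.132
std = 21453.128
q1, q3 = 18519.75, 55656.5
minimum, maximum = 0, 74247

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range

print(f"CV={cv:.6f} variance={variance:.4e} IQR={iqr} range={value_range}")
```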
Histogram with fixed size bins (bins=50)
Value                   Count   Frequency (%)
69572                   1       < 0.1%
27851                   1       < 0.1%
6924                    1       < 0.1%
61097                   1       < 0.1%
48517                   1       < 0.1%
62700                   1       < 0.1%
48914                   1       < 0.1%
479                     1       < 0.1%
12824                   1       < 0.1%
21909                   1       < 0.1%
Other values (59390)    59390   > 99.9%

Value                   Count   Frequency (%)
0                       1       < 0.1%
1                       1       < 0.1%
2                       1       < 0.1%
3                       1       < 0.1%
4                       1       < 0.1%
5                       1       < 0.1%
6                       1       < 0.1%
7                       1       < 0.1%
8                       1       < 0.1%
9                       1       < 0.1%

Value                   Count   Frequency (%)
74247                   1       < 0.1%
74246                   1       < 0.1%
74243                   1       < 0.1%
74242                   1       < 0.1%
74240                   1       < 0.1%
74239                   1       < 0.1%
74238                   1       < 0.1%
74237                   1       < 0.1%
74236                   1       < 0.1%
74235                   1       < 0.1%

amount_tsh
Real number (ℝ)

SKEWED  ZEROS 

Distinct98
Distinct (%)0.2%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean317.65038
Minimum0
Maximum350000
Zeros41639
Zeros (%)70.1%
Negative0
Negative (%)0.0%
Memory size464.2 KiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q320
95-th percentile1200
Maximum350000
Range350000
Interquartile range (IQR)20

Descriptive statistics

Standard deviation2997.5746
Coefficient of variation (CV)9.43671
Kurtosis4903.5431
Mean317.65038
Median Absolute Deviation (MAD)0
Skewness57.8078
Sum18868433
Variance8985453.2
MonotonicityNot monotonic
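
The large skewness of amount_tsh reflects a column that is mostly zeros with a few very large values. A minimal Python sketch of the population skewness g1 = m3 / m2^1.5 on a hypothetical toy sample; the report may use a bias-corrected estimator, so this formula is an assumption and is not expected to reproduce 57.8078 exactly:

```python
import math

def skewness(values):
    """Population skewness g1 = m3 / m2**1.5 (no bias correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5

# Hypothetical toy sample: mostly zeros plus a few large values,
# loosely mimicking the shape of amount_tsh.
toy = [0] * 7 + [20, 500, 5000]
raw_skew = skewness(toy)

# log1p keeps zeros at zero while compressing the long right tail.
log_skew = skewness([math.log1p(v) for v in toy])

print(f"raw: {raw_skew:.2f}, after log1p: {log_skew:.2f}")
```

On this toy sample the log1p transform lowers the skewness, which is one common reason to apply it to zero-heavy, long-tailed columns before modelling.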
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 41639
70.1%
500 3102
 
5.2%
50 2472
 
4.2%
1000 1488
 
2.5%
20 1463
 
2.5%
200 1220
 
2.1%
100 816
 
1.4%
10 806
 
1.4%
30 743
 
1.3%
2000 704
 
1.2%
Other values (88) 4947
 
8.3%
ValueCountFrequency (%)
0 41639
70.1%
0.2 3
 
< 0.1%
0.25 1
 
< 0.1%
1 3
 
< 0.1%
2 13
 
< 0.1%
5 376
 
0.6%
6 190
 
0.3%
7 69
 
0.1%
9 1
 
< 0.1%
10 806
 
1.4%
ValueCountFrequency (%)
350000 1
 
< 0.1%
250000 1
 
< 0.1%
200000 1
 
< 0.1%
170000 1
 
< 0.1%
138000 1
 
< 0.1%
120000 1
 
< 0.1%
117000 7
< 0.1%
100000 3
< 0.1%
70000 1
 
< 0.1%
60000 1
 
< 0.1%
Distinct356
Distinct (%)0.6%
Missing0
Missing (%)0.0%
Memory size464.2 KiB
Minimum2002-10-14 00:00:00
Maximum2013-12-03 00:00:00
Histogram with fixed size bins (bins=50)

funder
Text

MISSING 

Distinct1896
Distinct (%)3.4%
Missing3637
Missing (%)6.1%
Memory size464.2 KiB

Length

Max length30
Median length27
Mean length9.930115
Min length1

Characters and Unicode

Total characters: 553733
Distinct characters: 69
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 974
Unique (%): 1.7%
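
The percentages for funder are consistent with missing values being measured against all rows, while the distinct and unique shares and the mean length are measured against non-missing rows only. A short Python sketch of that reading; the choice of denominators is an inference from the numbers, not something the report states:

```python
# Figures copied from the funder section and the dataset statistics.
n_rows = 59400
missing = 3637
distinct = 1896
unique = 974
total_characters = 553733

non_missing = n_rows - missing

missing_pct = 100 * missing / n_rows         # share of all rows
distinct_pct = 100 * distinct / non_missing  # share of non-missing rows
unique_pct = 100 * unique / non_missing      # share of non-missing rows
mean_length = total_characters / non_missing

print(f"missing {missing_pct:.1f}%, distinct {distinct_pct:.1f}%, "
      f"unique {unique_pct:.1f}%, mean length {mean_length:.6f}")
```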

Sample

1st rowRoman
2nd rowGrumeti
3rd rowLottery Club
4th rowUnicef
5th rowAction In A
ValueCountFrequency (%)
of 9748
 
10.8%
government 9276
 
10.3%
tanzania 9172
 
10.1%
danida 3123
 
3.5%
world 2789
 
3.1%
water 2645
 
2.9%
hesawa 2203
 
2.4%
bank 1416
 
1.6%
rwssp 1376
 
1.5%
kkkt 1370
 
1.5%
Other values (2064) 47252
52.3%

Most occurring characters

ValueCountFrequency (%)
a 68200
 
12.3%
n 57840
 
10.4%
i 38011
 
6.9%
e 37462
 
6.8%
34673
 
6.3%
r 27879
 
5.0%
t 23016
 
4.2%
o 22739
 
4.1%
s 17208
 
3.1%
d 15464
 
2.8%
Other values (59) 211241
38.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 425874
76.9%
Uppercase Letter 89703
 
16.2%
Space Separator 34673
 
6.3%
Other Punctuation 1322
 
0.2%
Decimal Number 803
 
0.1%
Open Punctuation 437
 
0.1%
Close Punctuation 431
 
0.1%
Dash Punctuation 323
 
0.1%
Connector Punctuation 167
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 68200
16.0%
n 57840
13.6%
i 38011
 
8.9%
e 37462
 
8.8%
r 27879
 
6.5%
t 23016
 
5.4%
o 22739
 
5.3%
s 17208
 
4.0%
d 15464
 
3.6%
f 15329
 
3.6%
Other values (16) 102726
24.1%
Uppercase Letter
ValueCountFrequency (%)
T 12110
13.5%
G 10722
12.0%
O 10613
11.8%
D 7928
 
8.8%
W 7352
 
8.2%
C 4679
 
5.2%
R 4454
 
5.0%
H 3462
 
3.9%
M 3135
 
3.5%
K 2962
 
3.3%
Other values (16) 22286
24.8%
Decimal Number
ValueCountFrequency (%)
0 793
98.8%
2 5
 
0.6%
9 2
 
0.2%
1 2
 
0.2%
4 1
 
0.1%
Other Punctuation
ValueCountFrequency (%)
/ 783
59.2%
. 469
35.5%
\ 33
 
2.5%
& 26
 
2.0%
' 11
 
0.8%
Open Punctuation
ValueCountFrequency (%)
( 434
99.3%
[ 3
 
0.7%
Close Punctuation
ValueCountFrequency (%)
) 429
99.5%
] 2
 
0.5%
Space Separator
ValueCountFrequency (%)
34673
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 323
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 167
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 515577
93.1%
Common 38156
 
6.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 68200
 
13.2%
n 57840
 
11.2%
i 38011
 
7.4%
e 37462
 
7.3%
r 27879
 
5.4%
t 23016
 
4.5%
o 22739
 
4.4%
s 17208
 
3.3%
d 15464
 
3.0%
f 15329
 
3.0%
Other values (42) 192429
37.3%
Common
ValueCountFrequency (%)
34673
90.9%
0 793
 
2.1%
/ 783
 
2.1%
. 469
 
1.2%
( 434
 
1.1%
) 429
 
1.1%
- 323
 
0.8%
_ 167
 
0.4%
\ 33
 
0.1%
& 26
 
0.1%
Other values (7) 26
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 553733
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 68200
 
12.3%
n 57840
 
10.4%
i 38011
 
6.9%
e 37462
 
6.8%
34673
 
6.3%
r 27879
 
5.0%
t 23016
 
4.2%
o 22739
 
4.1%
s 17208
 
3.1%
d 15464
 
2.8%
Other values (59) 211241
38.1%

gps_height
Real number (ℝ)

ZEROS 

Distinct2428
Distinct (%)4.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean668.29724
Minimum-90
Maximum2770
Zeros20438
Zeros (%)34.4%
Negative1496
Negative (%)2.5%
Memory size464.2 KiB

Quantile statistics

Minimum-90
5-th percentile0
Q10
median369
Q31319.25
95-th percentile1797
Maximum2770
Range2860
Interquartile range (IQR)1319.25

Descriptive statistics

Standard deviation693.11635
Coefficient of variation (CV)1.0371378
Kurtosis-1.2924401
Mean668.29724
Median Absolute Deviation (MAD)369
Skewness0.46240208
Sum39696856
Variance480410.28
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 20438
34.4%
-15 60
 
0.1%
-16 55
 
0.1%
-13 55
 
0.1%
1290 52
 
0.1%
-20 52
 
0.1%
-14 51
 
0.1%
303 51
 
0.1%
-18 49
 
0.1%
-19 47
 
0.1%
Other values (2418) 38490
64.8%
ValueCountFrequency (%)
-90 1
 
< 0.1%
-63 2
 
< 0.1%
-59 1
 
< 0.1%
-57 1
 
< 0.1%
-55 1
 
< 0.1%
-54 1
 
< 0.1%
-53 1
 
< 0.1%
-52 2
 
< 0.1%
-51 2
 
< 0.1%
-50 5
< 0.1%
ValueCountFrequency (%)
2770 1
< 0.1%
2628 1
< 0.1%
2627 1
< 0.1%
2626 2
< 0.1%
2623 1
< 0.1%
2614 1
< 0.1%
2585 1
< 0.1%
2576 1
< 0.1%
2569 1
< 0.1%
2568 1
< 0.1%

installer
Text

MISSING 

Distinct2145
Distinct (%)3.8%
Missing3655
Missing (%)6.2%
Memory size464.2 KiB

Length

Max length30
Median length29
Mean length6.1112028
Min length1

Characters and Unicode

Total characters: 340669
Distinct characters: 70
Distinct categories: 10
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1098
Unique (%): 2.0%

Sample

1st rowRoman
2nd rowGRUMETI
3rd rowWorld vision
4th rowUNICEF
5th rowArtisan
ValueCountFrequency (%)
dwe 17601
25.8%
government 2778
 
4.1%
water 1881
 
2.8%
hesawa 1395
 
2.0%
rwe 1230
 
1.8%
district 1216
 
1.8%
kkkt 1153
 
1.7%
council 1106
 
1.6%
commu 1065
 
1.6%
danida 1051
 
1.5%
Other values (1976) 37806
55.4%

Most occurring characters

ValueCountFrequency (%)
D 27595
 
8.1%
W 25849
 
7.6%
E 25389
 
7.5%
a 17343
 
5.1%
n 16558
 
4.9%
e 15500
 
4.5%
i 15053
 
4.4%
A 13668
 
4.0%
r 13377
 
3.9%
t 12904
 
3.8%
Other values (60) 157433
46.2%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 167438
49.1%
Lowercase Letter 158190
46.4%
Space Separator 12673
 
3.7%
Other Punctuation 971
 
0.3%
Decimal Number 783
 
0.2%
Dash Punctuation 268
 
0.1%
Connector Punctuation 169
 
< 0.1%
Open Punctuation 159
 
< 0.1%
Close Punctuation 16
 
< 0.1%
Currency Symbol 2
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
D 27595
16.5%
W 25849
15.4%
E 25389
15.2%
A 13668
8.2%
C 10535
 
6.3%
S 6659
 
4.0%
R 6518
 
3.9%
I 6160
 
3.7%
T 5948
 
3.6%
K 5390
 
3.2%
Other values (16) 33727
20.1%
Lowercase Letter
ValueCountFrequency (%)
a 17343
11.0%
n 16558
10.5%
e 15500
9.8%
i 15053
9.5%
r 13377
8.5%
t 12904
 
8.2%
o 12398
 
7.8%
m 9289
 
5.9%
l 6201
 
3.9%
s 6173
 
3.9%
Other values (16) 33394
21.1%
Other Punctuation
ValueCountFrequency (%)
/ 670
69.0%
. 238
 
24.5%
& 50
 
5.1%
' 12
 
1.2%
# 1
 
0.1%
Decimal Number
ValueCountFrequency (%)
0 780
99.6%
9 1
 
0.1%
4 1
 
0.1%
1 1
 
0.1%
Close Punctuation
ValueCountFrequency (%)
} 13
81.2%
] 2
 
12.5%
) 1
 
6.2%
Open Punctuation
ValueCountFrequency (%)
( 157
98.7%
[ 2
 
1.3%
Space Separator
ValueCountFrequency (%)
12673
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 268
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 169
100.0%
Currency Symbol
ValueCountFrequency (%)
$ 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 325628
95.6%
Common 15041
 
4.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
D 27595
 
8.5%
W 25849
 
7.9%
E 25389
 
7.8%
a 17343
 
5.3%
n 16558
 
5.1%
e 15500
 
4.8%
i 15053
 
4.6%
A 13668
 
4.2%
r 13377
 
4.1%
t 12904
 
4.0%
Other values (42) 142392
43.7%
Common
ValueCountFrequency (%)
12673
84.3%
0 780
 
5.2%
/ 670
 
4.5%
- 268
 
1.8%
. 238
 
1.6%
_ 169
 
1.1%
( 157
 
1.0%
& 50
 
0.3%
} 13
 
0.1%
' 12
 
0.1%
Other values (8) 11
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 340669
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
D 27595
 
8.1%
W 25849
 
7.6%
E 25389
 
7.5%
a 17343
 
5.1%
n 16558
 
4.9%
e 15500
 
4.5%
i 15053
 
4.4%
A 13668
 
4.0%
r 13377
 
3.9%
t 12904
 
3.8%
Other values (60) 157433
46.2%

longitude
Real number (ℝ)

ZEROS 

Distinct57516
Distinct (%)96.8%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean34.077427
Minimum0
Maximum40.345193
Zeros1812
Zeros (%)3.1%
Negative0
Negative (%)0.0%
Memory size464.2 KiB

Quantile statistics

Minimum0
5-th percentile30.04066
Q133.090347
median34.908743
Q337.178387
95-th percentile39.13324
Maximum40.345193
Range40.345193
Interquartile range (IQR)4.0880392

Descriptive statistics

Standard deviation6.5674318
Coefficient of variation (CV)0.19272089
Kurtosis19.187031
Mean34.077427
Median Absolute Deviation (MAD)2.0325111
Skewness-4.1910465
Sum2024199.1
Variance43.131161
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 1812
 
3.1%
37.37571687 2
 
< 0.1%
38.34050134 2
 
< 0.1%
39.08618257 2
 
< 0.1%
33.00503158 2
 
< 0.1%
39.09178536 2
 
< 0.1%
32.98751118 2
 
< 0.1%
37.23632569 2
 
< 0.1%
39.08628657 2
 
< 0.1%
39.08596496 2
 
< 0.1%
Other values (57506) 57570
96.9%
ValueCountFrequency (%)
0 1812
3.1%
29.6071219 1
 
< 0.1%
29.60720109 1
 
< 0.1%
29.61032056 1
 
< 0.1%
29.61096482 1
 
< 0.1%
29.61194674 1
 
< 0.1%
29.61250689 1
 
< 0.1%
29.61276296 1
 
< 0.1%
29.61344309 1
 
< 0.1%
29.6168718 1
 
< 0.1%
ValueCountFrequency (%)
40.34519307 1
< 0.1%
40.34430089 1
< 0.1%
40.32523996 1
< 0.1%
40.32522643 1
< 0.1%
40.32340181 1
< 0.1%
40.32283237 1
< 0.1%
40.32280453 1
< 0.1%
40.3226251 1
< 0.1%
40.32216902 1
< 0.1%
40.32196593 1
< 0.1%
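
The 1812 zeros in longitude sit far from the next smallest value (29.6071219), so they may be placeholders rather than measured coordinates. The sketch below treats an exact zero as a missing-value sentinel; that interpretation is an assumption, and the function names are hypothetical:

```python
def mask_zero_longitude(values):
    """Return a copy in which exact zeros are replaced by None."""
    return [None if v == 0 else v for v in values]

def summarize(values):
    """Count, minimum and maximum over the values that are present."""
    present = [v for v in values if v is not None]
    return {"count": len(present), "min": min(present), "max": max(present)}

# Small illustrative sample; the non-zero values appear in the tables
# for longitude in this report.
sample = [0, 29.6071219, 0, 37.37571687, 40.34519307]
cleaned = mask_zero_longitude(sample)
print(summarize(cleaned))
```

With the zeros masked, the minimum moves from 0 to the smallest recorded coordinate, which also changes the range, mean, and skewness computed from the column.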

latitude
Real number (ℝ)

Distinct: 57517
Distinct (%): 96.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -5.7060327
Minimum: -11.64944
Maximum: -2 × 10⁻⁸
Zeros: 0
Zeros (%): 0.0%
Negative: 59400
Negative (%): 100.0%
Memory size: 464.2 KiB

Quantile statistics

Minimum: -11.64944
5-th percentile: -10.58555
Q1: -8.5406213
Median: -5.0215966
Q3: -3.3261556
95-th percentile: -1.4088722
Maximum: -2 × 10⁻⁸
Range: 11.64944
Interquartile range (IQR): 5.2144657

Descriptive statistics

Standard deviation2.9460191
Coefficient of variation (CV)-0.51629902
Kurtosis-1.0576167
Mean-5.7060327
Median Absolute Deviation (MAD)2.0700299
Skewness-0.15203657
Sum-338938.34
Variance8.6790284
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
-2 × 10⁻⁸ 1812
 
3.1%
-6.98584173 2
 
< 0.1%
-6.9802204 2
 
< 0.1%
-2.47667983 2
 
< 0.1%
-6.97826294 2
 
< 0.1%
-7.07808103 2
 
< 0.1%
-2.46524583 2
 
< 0.1%
-2.4943533 2
 
< 0.1%
-7.1772029 2
 
< 0.1%
-2.51532072 2
 
< 0.1%
Other values (57507) 57570
96.9%
ValueCountFrequency (%)
-11.64944018 1
< 0.1%
-11.64837759 1
< 0.1%
-11.58629656 1
< 0.1%
-11.56857679 1
< 0.1%
-11.56680457 1
< 0.1%
-11.56450865 1
< 0.1%
-11.56432357 1
< 0.1%
-11.56231592 1
< 0.1%
-11.56228898 1
< 0.1%
-11.56161898 1
< 0.1%
ValueCountFrequency (%)
-2 × 10⁻⁸ 1812
3.1%
-0.99846435 1
 
< 0.1%
-0.998916 1
 
< 0.1%
-0.99901209 1
 
< 0.1%
-0.99911702 1
 
< 0.1%
-0.9994692 1
 
< 0.1%
-0.99950651 1
 
< 0.1%
-0.99952232 1
 
< 0.1%
-1.00058519 1
 
< 0.1%
-1.0015208 1
 
< 0.1%
Distinct37399
Distinct (%)63.0%
Missing2
Missing (%)< 0.1%
Memory size464.2 KiB

Length

Max length30
Median length25
Mean length10.962339
Min length1

Characters and Unicode

Total characters: 651141
Distinct characters: 75
Distinct categories: 10
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 32928
Unique (%): 55.4%

Sample

1st rownone
2nd rowZahanati
3rd rowKwa Mahundi
4th rowZahanati Ya Nanyumbu
5th rowShuleni
ValueCountFrequency (%)
kwa 21384
 
19.6%
none 3563
 
3.3%
mzee 3385
 
3.1%
shuleni 2123
 
1.9%
ya 1499
 
1.4%
shule 1389
 
1.3%
school 1113
 
1.0%
primary 1052
 
1.0%
zahanati 983
 
0.9%
msingi 870
 
0.8%
Other values (29461) 71931
65.8%

Most occurring characters

ValueCountFrequency (%)
a 98806
15.2%
i 52404
 
8.0%
49898
 
7.7%
n 42146
 
6.5%
e 40983
 
6.3%
w 31669
 
4.9%
K 31385
 
4.8%
o 30245
 
4.6%
u 24217
 
3.7%
M 22040
 
3.4%
Other values (65) 227348
34.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 493416
75.8%
Uppercase Letter 105183
 
16.2%
Space Separator 49898
 
7.7%
Decimal Number 1680
 
0.3%
Other Punctuation 741
 
0.1%
Dash Punctuation 104
 
< 0.1%
Open Punctuation 37
 
< 0.1%
Close Punctuation 37
 
< 0.1%
Connector Punctuation 24
 
< 0.1%
Modifier Symbol 21
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 98806
20.0%
i 52404
10.6%
n 42146
 
8.5%
e 40983
 
8.3%
w 31669
 
6.4%
o 30245
 
6.1%
u 24217
 
4.9%
l 20954
 
4.2%
m 17631
 
3.6%
h 17215
 
3.5%
Other values (16) 117146
23.7%
Uppercase Letter
ValueCountFrequency (%)
K 31385
29.8%
M 22040
21.0%
S 10752
 
10.2%
N 4878
 
4.6%
A 3497
 
3.3%
B 3425
 
3.3%
C 2791
 
2.7%
P 2564
 
2.4%
L 2507
 
2.4%
J 2385
 
2.3%
Other values (16) 18959
18.0%
Decimal Number
ValueCountFrequency (%)
1 507
30.2%
2 439
26.1%
3 152
 
9.0%
4 120
 
7.1%
7 106
 
6.3%
5 86
 
5.1%
6 80
 
4.8%
8 75
 
4.5%
9 70
 
4.2%
0 45
 
2.7%
Other Punctuation
ValueCountFrequency (%)
' 417
56.3%
. 175
23.6%
/ 146
 
19.7%
& 2
 
0.3%
\ 1
 
0.1%
Open Punctuation
ValueCountFrequency (%)
( 29
78.4%
[ 8
 
21.6%
Close Punctuation
ValueCountFrequency (%)
) 29
78.4%
] 8
 
21.6%
Space Separator
ValueCountFrequency (%)
49898
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 104
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 24
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 21
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 598599
91.9%
Common 52542
 
8.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 98806
16.5%
i 52404
 
8.8%
n 42146
 
7.0%
e 40983
 
6.8%
w 31669
 
5.3%
K 31385
 
5.2%
o 30245
 
5.1%
u 24217
 
4.0%
M 22040
 
3.7%
l 20954
 
3.5%
Other values (42) 203750
34.0%
Common
ValueCountFrequency (%)
49898
95.0%
1 507
 
1.0%
2 439
 
0.8%
' 417
 
0.8%
. 175
 
0.3%
3 152
 
0.3%
/ 146
 
0.3%
4 120
 
0.2%
7 106
 
0.2%
- 104
 
0.2%
Other values (13) 478
 
0.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 651141
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 98806
15.2%
i 52404
 
8.0%
49898
 
7.7%
n 42146
 
6.5%
e 40983
 
6.3%
w 31669
 
4.9%
K 31385
 
4.8%
o 30245
 
4.6%
u 24217
 
3.7%
M 22040
 
3.4%
Other values (65) 227348
34.9%

num_private
Real number (ℝ)

SKEWED  ZEROS 

Distinct65
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean0.47414141
Minimum0
Maximum1776
Zeros58643
Zeros (%)98.7%
Negative0
Negative (%)0.0%
Memory size464.2 KiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q30
95-th percentile0
Maximum1776
Range1776
Interquartile range (IQR)0

Descriptive statistics

Standard deviation12.23623
Coefficient of variation (CV)25.807131
Kurtosis11137.295
Mean0.47414141
Median Absolute Deviation (MAD)0
Skewness91.93375
Sum28164
Variance149.72532
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 58643
98.7%
6 81
 
0.1%
1 73
 
0.1%
5 46
 
0.1%
8 46
 
0.1%
32 40
 
0.1%
45 36
 
0.1%
15 35
 
0.1%
39 30
 
0.1%
93 28
 
< 0.1%
Other values (55) 342
 
0.6%
ValueCountFrequency (%)
0 58643
98.7%
1 73
 
0.1%
2 23
 
< 0.1%
3 27
 
< 0.1%
4 20
 
< 0.1%
5 46
 
0.1%
6 81
 
0.1%
7 26
 
< 0.1%
8 46
 
0.1%
9 4
 
< 0.1%
ValueCountFrequency (%)
1776 1
< 0.1%
1402 1
< 0.1%
755 1
< 0.1%
698 1
< 0.1%
672 1
< 0.1%
668 1
< 0.1%
450 1
< 0.1%
300 1
< 0.1%
280 1
< 0.1%
240 1
< 0.1%

basin
Categorical

Distinct9
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size464.2 KiB
Lake Victoria
10248 
Pangani
8940 
Rufiji
7976 
Internal
7785 
Lake Tanganyika
6432 
Other values (4)
18019 

Length

Max length23
Median length11
Mean length10.892357
Min length6

Characters and Unicode

Total characters: 647006
Distinct characters: 32
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st rowLake Nyasa
2nd rowLake Victoria
3rd rowPangani
4th rowRuvuma / Southern Coast
5th rowLake Victoria

Common Values

Value                      Count   Frequency (%)
Lake Victoria              10248   17.3%
Pangani                    8940    15.1%
Rufiji                     7976    13.4%
Internal                   7785    13.1%
Lake Tanganyika            6432    10.8%
Wami / Ruvu                5987    10.1%
Lake Nyasa                 5085    8.6%
Ruvuma / Southern Coast    4493    7.6%
Lake Rukwa                 2454    4.1%
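
The alerts in the overview report imbalance percentages for several variables, while basin carries no such alert. One common way to score imbalance is one minus the normalized Shannon entropy of the class counts. The Python sketch below applies it to the basin counts; that this matches the profiler's exact formula is an assumption, and `imbalance_score` is a hypothetical helper:

```python
import math

def imbalance_score(counts):
    """1 - H / H_max, where H is the Shannon entropy of the class shares."""
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0  # hypothetical convention for a single class
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(n_classes)

# Counts copied from the Common Values table for basin.
basin_counts = [10248, 8940, 7976, 7785, 6432, 5987, 5085, 4493, 2454]
score = imbalance_score(basin_counts)
print(f"basin imbalance: {score:.1%}")
```

A score near 0 means the classes are close to evenly spread, and a score near 1 means a single class dominates. Under this definition the basin counts score low, in line with basin not being flagged.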

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
lake 24219
22.2%
10480
9.6%
victoria 10248
9.4%
pangani 8940
 
8.2%
rufiji 7976
 
7.3%
internal 7785
 
7.1%
tanganyika 6432
 
5.9%
wami 5987
 
5.5%
ruvu 5987
 
5.5%
nyasa 5085
 
4.7%
Other values (4) 15933
14.6%

Most occurring characters

ValueCountFrequency (%)
a 107025
16.5%
i 57807
 
8.9%
n 50807
 
7.9%
49672
 
7.7%
e 36497
 
5.6%
u 35883
 
5.5%
k 33105
 
5.1%
t 27019
 
4.2%
L 24219
 
3.7%
r 22526
 
3.5%
Other values (22) 202446
31.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 488262
75.5%
Uppercase Letter 98592
 
15.2%
Space Separator 49672
 
7.7%
Other Punctuation 10480
 
1.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 107025
21.9%
i 57807
11.8%
n 50807
10.4%
e 36497
 
7.5%
u 35883
 
7.3%
k 33105
 
6.8%
t 27019
 
5.5%
r 22526
 
4.6%
o 19234
 
3.9%
g 15372
 
3.1%
Other values (10) 82987
17.0%
Uppercase Letter
ValueCountFrequency (%)
L 24219
24.6%
R 20910
21.2%
V 10248
10.4%
P 8940
 
9.1%
I 7785
 
7.9%
T 6432
 
6.5%
W 5987
 
6.1%
N 5085
 
5.2%
S 4493
 
4.6%
C 4493
 
4.6%
Space Separator
ValueCountFrequency (%)
49672
100.0%
Other Punctuation
ValueCountFrequency (%)
/ 10480
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 586854
90.7%
Common 60152
 
9.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 107025
18.2%
i 57807
 
9.9%
n 50807
 
8.7%
e 36497
 
6.2%
u 35883
 
6.1%
k 33105
 
5.6%
t 27019
 
4.6%
L 24219
 
4.1%
r 22526
 
3.8%
R 20910
 
3.6%
Other values (20) 171056
29.1%
Common
ValueCountFrequency (%)
49672
82.6%
/ 10480
 
17.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 647006
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 107025
16.5%
i 57807
 
8.9%
n 50807
 
7.9%
49672
 
7.7%
e 36497
 
5.6%
u 35883
 
5.5%
k 33105
 
5.1%
t 27019
 
4.2%
L 24219
 
3.7%
r 22526
 
3.5%
Other values (22) 202446
31.3%
Distinct19287
Distinct (%)32.7%
Missing371
Missing (%)0.6%
Memory size464.2 KiB

Length

Max length30
Median length27
Mean length7.8975927
Min length1

Characters and Unicode

Total characters: 466187
Distinct characters: 73
Distinct categories: 10
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 9424
Unique (%): 16.0%

Sample

1st rowMnyusi B
2nd rowNyamara
3rd rowMajengo
4th rowMahakamani
5th rowKyanyamisa
ValueCountFrequency (%)
a 2387
 
3.4%
b 2043
 
2.9%
kati 1902
 
2.7%
majengo 610
 
0.9%
wa 600
 
0.8%
shuleni 593
 
0.8%
madukani 569
 
0.8%
mtaa 514
 
0.7%
juu 403
 
0.6%
mjini 378
 
0.5%
Other values (17024) 60795
85.9%

Most occurring characters

ValueCountFrequency (%)
a 72003
15.4%
i 45666
 
9.8%
n 33499
 
7.2%
u 26424
 
5.7%
e 25671
 
5.5%
o 23556
 
5.1%
M 20431
 
4.4%
g 18951
 
4.1%
l 16372
 
3.5%
m 15053
 
3.2%
Other values (63) 168561
36.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 381263
81.8%
Uppercase Letter 71291
 
15.3%
Space Separator 11766
 
2.5%
Other Punctuation 1184
 
0.3%
Decimal Number 589
 
0.1%
Modifier Symbol 45
 
< 0.1%
Dash Punctuation 36
 
< 0.1%
Open Punctuation 5
 
< 0.1%
Close Punctuation 5
 
< 0.1%
Connector Punctuation 3
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 72003
18.9%
i 45666
12.0%
n 33499
 
8.8%
u 26424
 
6.9%
e 25671
 
6.7%
o 23556
 
6.2%
g 18951
 
5.0%
l 16372
 
4.3%
m 15053
 
3.9%
b 11843
 
3.1%
Other values (16) 92225
24.2%
Uppercase Letter
ValueCountFrequency (%)
M 20431
28.7%
K 12545
17.6%
N 6068
 
8.5%
B 5112
 
7.2%
I 4503
 
6.3%
S 4039
 
5.7%
A 3076
 
4.3%
C 2533
 
3.6%
L 2458
 
3.4%
U 1704
 
2.4%
Other values (15) 8822
12.4%
Decimal Number
ValueCountFrequency (%)
1 242
41.1%
2 70
 
11.9%
3 50
 
8.5%
4 49
 
8.3%
6 33
 
5.6%
9 32
 
5.4%
8 32
 
5.4%
0 30
 
5.1%
5 29
 
4.9%
7 22
 
3.7%
Other Punctuation
ValueCountFrequency (%)
' 1017
85.9%
/ 136
 
11.5%
. 29
 
2.4%
# 2
 
0.2%
Open Punctuation
ValueCountFrequency (%)
( 4
80.0%
[ 1
 
20.0%
Close Punctuation
ValueCountFrequency (%)
) 4
80.0%
] 1
 
20.0%
Space Separator
ValueCountFrequency (%)
11766
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 45
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 36
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 452554
97.1%
Common 13633
 
2.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 72003
15.9%
i 45666
 
10.1%
n 33499
 
7.4%
u 26424
 
5.8%
e 25671
 
5.7%
o 23556
 
5.2%
M 20431
 
4.5%
g 18951
 
4.2%
l 16372
 
3.6%
m 15053
 
3.3%
Other values (41) 154928
34.2%
Common
ValueCountFrequency (%)
11766
86.3%
' 1017
 
7.5%
1 242
 
1.8%
/ 136
 
1.0%
2 70
 
0.5%
3 50
 
0.4%
4 49
 
0.4%
` 45
 
0.3%
- 36
 
0.3%
6 33
 
0.2%
Other values (12) 189
 
1.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 466187
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 72003
15.4%
i 45666
 
9.8%
n 33499
 
7.2%
u 26424
 
5.7%
e 25671
 
5.5%
o 23556
 
5.1%
M 20431
 
4.4%
g 18951
 
4.1%
l 16372
 
3.5%
m 15053
 
3.2%
Other values (63) 168561
36.2%

region
Categorical

Distinct21
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size464.2 KiB
Iringa
5294 
Shinyanga
4982 
Mbeya
4639 
Kilimanjaro
4379 
Morogoro
4006 
Other values (16)
36100 

Length

Max length13
Median length11
Mean length6.6237542
Min length4

Characters and Unicode

Total characters: 393451
Distinct characters: 32
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st rowIringa
2nd rowMara
3rd rowManyara
4th rowMtwara
5th rowKagera

Common Values

ValueCountFrequency (%)
Iringa 5294
 
8.9%
Shinyanga 4982
 
8.4%
Mbeya 4639
 
7.8%
Kilimanjaro 4379
 
7.4%
Morogoro 4006
 
6.7%
Arusha 3350
 
5.6%
Kagera 3316
 
5.6%
Mwanza 3102
 
5.2%
Kigoma 2816
 
4.7%
Ruvuma 2640
 
4.4%
Other values (11) 20876
35.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
iringa 5294
 
8.7%
shinyanga 4982
 
8.2%
mbeya 4639
 
7.6%
kilimanjaro 4379
 
7.2%
morogoro 4006
 
6.6%
arusha 3350
 
5.5%
kagera 3316
 
5.4%
mwanza 3102
 
5.1%
kigoma 2816
 
4.6%
ruvuma 2640
 
4.3%
Other values (13) 22486
36.9%

Most occurring characters

ValueCountFrequency (%)
a 83413
21.2%
n 33143
 
8.4%
r 32397
 
8.2%
i 31763
 
8.1%
o 29580
 
7.5%
g 25054
 
6.4%
M 17029
 
4.3%
m 12841
 
3.3%
y 11204
 
2.8%
K 10511
 
2.7%
Other values (22) 106516
27.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 331636
84.3%
Uppercase Letter 60205
 
15.3%
Space Separator 1610
 
0.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 83413
25.2%
n 33143
 
10.0%
r 32397
 
9.8%
i 31763
 
9.6%
o 29580
 
8.9%
g 25054
 
7.6%
m 12841
 
3.9%
y 11204
 
3.4%
u 10438
 
3.1%
w 9275
 
2.8%
Other values (11) 52528
15.8%
Uppercase Letter
ValueCountFrequency (%)
M 17029
28.3%
K 10511
17.5%
S 7880
13.1%
I 5294
 
8.8%
T 4506
 
7.5%
R 4448
 
7.4%
A 3350
 
5.6%
D 3006
 
5.0%
P 2635
 
4.4%
L 1546
 
2.6%
Space Separator
ValueCountFrequency (%)
1610
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 391841
99.6%
Common 1610
 
0.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 83413
21.3%
n 33143
 
8.5%
r 32397
 
8.3%
i 31763
 
8.1%
o 29580
 
7.5%
g 25054
 
6.4%
M 17029
 
4.3%
m 12841
 
3.3%
y 11204
 
2.9%
K 10511
 
2.7%
Other values (21) 104906
26.8%
Common
ValueCountFrequency (%)
1610
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 393451
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 83413
21.2%
n 33143
 
8.4%
r 32397
 
8.2%
i 31763
 
8.1%
o 29580
 
7.5%
g 25054
 
6.4%
M 17029
 
4.3%
m 12841
 
3.3%
y 11204
 
2.8%
K 10511
 
2.7%
Other values (22) 106516
27.1%

region_code
Real number (ℝ)

Distinct27
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean15.297003
Minimum1
Maximum99
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size464.2 KiB

Quantile statistics

Minimum1
5-th percentile2
Q15
median12
Q317
95-th percentile60
Maximum99
Range98
Interquartile range (IQR)12

Descriptive statistics

Standard deviation17.587406
Coefficient of variation (CV)1.1497289
Kurtosis10.288433
Mean15.297003
Median Absolute Deviation (MAD)6
Skewness3.1738181
Sum908642
Variance309.31686
MonotonicityNot monotonic
Histogram with fixed size bins (bins=27)
ValueCountFrequency (%)
11 5300
 
8.9%
17 5011
 
8.4%
12 4639
 
7.8%
3 4379
 
7.4%
5 4040
 
6.8%
18 3324
 
5.6%
19 3047
 
5.1%
2 3024
 
5.1%
16 2816
 
4.7%
10 2640
 
4.4%
Other values (17) 21180
35.7%
ValueCountFrequency (%)
1 2201
3.7%
2 3024
5.1%
3 4379
7.4%
4 2513
4.2%
5 4040
6.8%
6 1609
 
2.7%
7 805
 
1.4%
8 300
 
0.5%
9 390
 
0.7%
10 2640
4.4%
ValueCountFrequency (%)
99 423
 
0.7%
90 917
 
1.5%
80 1238
 
2.1%
60 1025
 
1.7%
40 1
 
< 0.1%
24 326
 
0.5%
21 1583
2.7%
20 1969
3.3%
19 3047
5.1%
18 3324
5.6%

district_code
Real number (ℝ)

Distinct20
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean5.6297475
Minimum0
Maximum80
Zeros23
Zeros (%)< 0.1%
Negative0
Negative (%)0.0%
Memory size464.2 KiB

Quantile statistics

Minimum0
5-th percentile1
Q12
median3
Q35
95-th percentile30
Maximum80
Range80
Interquartile range (IQR)3

Descriptive statistics

Standard deviation9.6336486
Coefficient of variation (CV)1.7112044
Kurtosis16.214284
Mean5.6297475
Median Absolute Deviation (MAD)1
Skewness3.9620453
Sum334407
Variance92.807186
MonotonicityNot monotonic
Histogram with fixed size bins (bins=20)

Common Values

Value | Count | Frequency (%)
1 | 12203 | 20.5%
2 | 11173 | 18.8%
3 | 9998 | 16.8%
4 | 8999 | 15.1%
5 | 4356 | 7.3%
6 | 4074 | 6.9%
7 | 3343 | 5.6%
8 | 1043 | 1.8%
30 | 995 | 1.7%
33 | 874 | 1.5%
Other values (10) | 2342 | 3.9%

Minimum 10 values

Value | Count | Frequency (%)
0 | 23 | < 0.1%
1 | 12203 | 20.5%
2 | 11173 | 18.8%
3 | 9998 | 16.8%
4 | 8999 | 15.1%
5 | 4356 | 7.3%
6 | 4074 | 6.9%
7 | 3343 | 5.6%
8 | 1043 | 1.8%
13 | 391 | 0.7%

Maximum 10 values

Value | Count | Frequency (%)
80 | 12 | < 0.1%
67 | 6 | < 0.1%
63 | 195 | 0.3%
62 | 109 | 0.2%
60 | 63 | 0.1%
53 | 745 | 1.3%
43 | 505 | 0.9%
33 | 874 | 1.5%
30 | 995 | 1.7%
23 | 293 | 0.5%

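The derived figures in the district_code summary follow from a handful of base values. The sketch below assumes the usual definitions (mean as sum over row count, CV as standard deviation over mean, variance as the squared standard deviation, IQR as Q3 minus Q1). The variable names are illustrative; the inputs are copied from the district_code tables and the row count from the report overview.

```python
import math

# Values copied from the district_code summary; names are illustrative.
n_rows = 59400          # number of observations from the report overview
total = 334407          # Sum
std = 9.6336486         # Standard deviation
q1, q3 = 2, 5           # Q1 and Q3
minimum, maximum = 0, 80

mean = total / n_rows            # reported as 5.6297475
cv = std / mean                  # reported as 1.7112044
variance = std ** 2              # reported as 92.807186
iqr = q3 - q1                    # reported as 3
value_range = maximum - minimum  # reported as 80

print(f"mean={mean:.7f} cv={cv:.7f} variance={variance:.4f} "
      f"iqr={iqr} range={value_range}")
```

A mismatch between any recomputed figure and the reported one would suggest a transcription error in the table rather than a different definition.
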
lga
Text

Distinct: 125
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 464.2 KiB

Length

Max length: 16
Median length: 14
Mean length: 7.4168855
Min length: 3

Characters and Unicode

Total characters: 440563
Distinct characters: 41
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: Ludewa
2nd row: Serengeti
3rd row: Simanjiro
4th row: Nanyumbu
5th row: Karagwe

Words

Value | Count | Frequency (%)
rural | 9552 | 13.5%
njombe | 2503 | 3.5%
urban | 1683 | 2.4%
moshi | 1330 | 1.9%
arusha | 1315 | 1.9%
bariadi | 1177 | 1.7%
singida | 1172 | 1.7%
rungwe | 1106 | 1.6%
kilosa | 1094 | 1.5%
kasulu | 1047 | 1.5%
Other values (106) | 48656 | 68.9%

Most occurring characters

ValueCountFrequency (%)
a 69982
15.9%
o 30079
 
6.8%
i 29483
 
6.7%
u 28324
 
6.4%
r 26886
 
6.1%
e 22579
 
5.1%
n 22521
 
5.1%
l 19238
 
4.4%
g 18385
 
4.2%
M 16017
 
3.6%
Other values (31) 157069
35.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 358693
81.4%
Uppercase Letter 70635
 
16.0%
Space Separator 11235
 
2.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 69982
19.5%
o 30079
 
8.4%
i 29483
 
8.2%
u 28324
 
7.9%
r 26886
 
7.5%
e 22579
 
6.3%
n 22521
 
6.3%
l 19238
 
5.4%
g 18385
 
5.1%
m 15622
 
4.4%
Other values (14) 75594
21.1%
Uppercase Letter
ValueCountFrequency (%)
M 16017
22.7%
R 12207
17.3%
K 11663
16.5%
S 6261
 
8.9%
N 5760
 
8.2%
B 4839
 
6.9%
U 3410
 
4.8%
I 2480
 
3.5%
L 2131
 
3.0%
T 1367
 
1.9%
Other values (6) 4500
 
6.4%
Space Separator
ValueCountFrequency (%)
11235
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 429328
97.4%
Common 11235
 
2.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 69982
16.3%
o 30079
 
7.0%
i 29483
 
6.9%
u 28324
 
6.6%
r 26886
 
6.3%
e 22579
 
5.3%
n 22521
 
5.2%
l 19238
 
4.5%
g 18385
 
4.3%
M 16017
 
3.7%
Other values (30) 145834
34.0%
Common
ValueCountFrequency (%)
11235
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 440563
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 69982
15.9%
o 30079
 
6.8%
i 29483
 
6.7%
u 28324
 
6.4%
r 26886
 
6.1%
e 22579
 
5.1%
n 22521
 
5.1%
l 19238
 
4.4%
g 18385
 
4.2%
M 16017
 
3.6%
Other values (31) 157069
35.7%

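The word table for lga appears to count lowercased, whitespace-separated tokens rather than whole values: the token counts sum to 70,635, which equals the 59,400 rows plus the 11,235 space characters reported for this variable. The sketch below shows that counting under this assumed tokenisation. The function name and the input values are hypothetical examples, not rows taken from the dataset.

```python
from collections import Counter

def word_counts(values):
    """Count lowercased, whitespace-separated tokens across all values."""
    counts = Counter()
    for value in values:
        counts.update(value.lower().split())
    return counts

# Hypothetical values for illustration only.
sample = ["Njombe", "Moshi Rural", "Singida Rural", "Moshi Urban"]
counts = word_counts(sample)
print(counts.most_common(3))
```

Because multi-word values contribute one token per word, generic suffixes such as "rural" and "urban" can outrank any single place name in a table built this way.
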
ward
Text

Distinct: 2092
Distinct (%): 3.5%
Missing: 0
Missing (%): 0.0%
Memory size: 464.2 KiB

Length

Max length: 23
Median length: 19
Mean length: 7.5058418
Min length: 3

Characters and Unicode

Total characters: 445847
Distinct characters: 54
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 30
Unique (%): 0.1%

Sample

1st row: Mundindi
2nd row: Natta
3rd row: Ngorika
4th row: Nanyumbu
5th row: Nyakasimbi

Words

Value | Count | Frequency (%)
mashariki | 580 | 0.9%
urban | 540 | 0.8%
siha | 434 | 0.7%
kusini | 393 | 0.6%
magharibi | 362 | 0.6%
igosi | 307 | 0.5%
masama | 303 | 0.5%
machame | 293 | 0.5%
kati | 270 | 0.4%
imalinyi | 252 | 0.4%
Other values (2106) | 61033 | 94.2%

Most occurring characters

ValueCountFrequency (%)
a 69533
15.6%
i 40243
 
9.0%
n 29584
 
6.6%
u 27015
 
6.1%
o 26093
 
5.9%
e 23589
 
5.3%
g 21166
 
4.7%
M 18916
 
4.2%
m 16216
 
3.6%
l 15799
 
3.5%
Other values (44) 157693
35.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 374730
84.0%
Uppercase Letter 64523
 
14.5%
Space Separator 5408
 
1.2%
Other Punctuation 1163
 
0.3%
Dash Punctuation 23
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 69533
18.6%
i 40243
10.7%
n 29584
 
7.9%
u 27015
 
7.2%
o 26093
 
7.0%
e 23589
 
6.3%
g 21166
 
5.6%
m 16216
 
4.3%
l 15799
 
4.2%
r 13057
 
3.5%
Other values (15) 92435
24.7%
Uppercase Letter
ValueCountFrequency (%)
M 18916
29.3%
K 11212
17.4%
I 6094
 
9.4%
N 5919
 
9.2%
S 3354
 
5.2%
L 3162
 
4.9%
B 3098
 
4.8%
U 2913
 
4.5%
C 2123
 
3.3%
R 1692
 
2.6%
Other values (15) 6040
 
9.4%
Other Punctuation
ValueCountFrequency (%)
' 1013
87.1%
/ 150
 
12.9%
Space Separator
ValueCountFrequency (%)
5408
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 23
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 439253
98.5%
Common 6594
 
1.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 69533
15.8%
i 40243
 
9.2%
n 29584
 
6.7%
u 27015
 
6.2%
o 26093
 
5.9%
e 23589
 
5.4%
g 21166
 
4.8%
M 18916
 
4.3%
m 16216
 
3.7%
l 15799
 
3.6%
Other values (40) 151099
34.4%
Common
ValueCountFrequency (%)
5408
82.0%
' 1013
 
15.4%
/ 150
 
2.3%
- 23
 
0.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 445847
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 69533
15.6%
i 40243
 
9.0%
n 29584
 
6.6%
u 27015
 
6.1%
o 26093
 
5.9%
e 23589
 
5.3%
g 21166
 
4.7%
M 18916
 
4.2%
m 16216
 
3.6%
l 15799
 
3.5%
Other values (44) 157693
35.4%

population
Real number (ℝ)

ZEROS 

Distinct: 1049
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 179.90998
Minimum: 0
Maximum: 30500
Zeros: 21381
Zeros (%): 36.0%
Negative: 0
Negative (%): 0.0%
Memory size: 464.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 25
Q3: 215
95-th percentile: 680
Maximum: 30500
Range: 30500
Interquartile range (IQR): 215

Descriptive statistics

Standard deviation: 471.48218
Coefficient of variation (CV): 2.620656
Kurtosis: 402.28012
Mean: 179.90998
Median Absolute Deviation (MAD): 25
Skewness: 12.660714
Sum: 10686653
Variance: 222295.44
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value | Count | Frequency (%)
0 | 21381 | 36.0%
1 | 7025 | 11.8%
200 | 1940 | 3.3%
150 | 1892 | 3.2%
250 | 1681 | 2.8%
300 | 1476 | 2.5%
100 | 1146 | 1.9%
50 | 1139 | 1.9%
500 | 1009 | 1.7%
350 | 986 | 1.7%
Other values (1039) | 19725 | 33.2%

Minimum 10 values

Value | Count | Frequency (%)
0 | 21381 | 36.0%
1 | 7025 | 11.8%
2 | 4 | < 0.1%
3 | 4 | < 0.1%
4 | 13 | < 0.1%
5 | 44 | 0.1%
6 | 19 | < 0.1%
7 | 3 | < 0.1%
8 | 23 | < 0.1%
9 | 11 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
30500 | 1 | < 0.1%
15300 | 1 | < 0.1%
11463 | 1 | < 0.1%
10000 | 3 | < 0.1%
9865 | 1 | < 0.1%
9500 | 1 | < 0.1%
9000 | 3 | < 0.1%
8848 | 1 | < 0.1%
8600 | 1 | < 0.1%
8500 | 1 | < 0.1%

public_meeting
Boolean

IMBALANCE  MISSING 

Distinct: 2
Distinct (%): < 0.1%
Missing: 3334
Missing (%): 5.6%
Memory size: 464.2 KiB

Value | Count | Frequency (%)
True | 51011 | 85.9%
False | 5055 | 8.5%
(Missing) | 3334 | 5.6%

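The overview flags public_meeting as highly imbalanced (56.3%). That figure is consistent with one minus the normalised Shannon entropy of the non-missing class counts, and the same calculation gives roughly the 69.3% reported for management_group. The sketch below works under that assumption; `imbalance_score` is an illustrative name, not a confirmed library function.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), with H the Shannon entropy (bits) of the class shares."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        # A single class has no entropy to normalise; treated as 0 in this sketch.
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

# Counts copied from the report; missing values are excluded.
public_meeting = [51011, 5055]                     # True, False
management_group = [52490, 3638, 1768, 943, 561]   # five categories

print(f"public_meeting:   {imbalance_score(public_meeting):.1%}")
print(f"management_group: {imbalance_score(management_group):.1%}")
```

A score of 0 means perfectly balanced classes and a score near 1 means almost all rows fall in one class.
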
recorded_by
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 464.2 KiB

Length

Max length: 23
Median length: 23
Mean length: 23
Min length: 23

Characters and Unicode

Total characters: 1366200
Distinct characters: 14
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: GeoData Consultants Ltd
2nd row: GeoData Consultants Ltd
3rd row: GeoData Consultants Ltd
4th row: GeoData Consultants Ltd
5th row: GeoData Consultants Ltd

Common Values

Value | Count | Frequency (%)
GeoData Consultants Ltd | 59400 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
geodata 59400
33.3%
consultants 59400
33.3%
ltd 59400
33.3%

Most occurring characters

ValueCountFrequency (%)
t 237600
17.4%
a 178200
13.0%
o 118800
8.7%
118800
8.7%
n 118800
8.7%
s 118800
8.7%
G 59400
 
4.3%
e 59400
 
4.3%
D 59400
 
4.3%
C 59400
 
4.3%
Other values (4) 237600
17.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1009800
73.9%
Uppercase Letter 237600
 
17.4%
Space Separator 118800
 
8.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 237600
23.5%
a 178200
17.6%
o 118800
11.8%
n 118800
11.8%
s 118800
11.8%
e 59400
 
5.9%
u 59400
 
5.9%
l 59400
 
5.9%
d 59400
 
5.9%
Uppercase Letter
ValueCountFrequency (%)
G 59400
25.0%
D 59400
25.0%
C 59400
25.0%
L 59400
25.0%
Space Separator
ValueCountFrequency (%)
118800
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1247400
91.3%
Common 118800
 
8.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 237600
19.0%
a 178200
14.3%
o 118800
9.5%
n 118800
9.5%
s 118800
9.5%
G 59400
 
4.8%
e 59400
 
4.8%
D 59400
 
4.8%
C 59400
 
4.8%
u 59400
 
4.8%
Other values (3) 178200
14.3%
Common
ValueCountFrequency (%)
118800
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1366200
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t 237600
17.4%
a 178200
13.0%
o 118800
8.7%
118800
8.7%
n 118800
8.7%
s 118800
8.7%
G 59400
 
4.3%
e 59400
 
4.3%
D 59400
 
4.3%
C 59400
 
4.3%
Other values (4) 237600
17.4%

scheme_management
Categorical

MISSING 

Distinct: 11
Distinct (%): < 0.1%
Missing: 3878
Missing (%): 6.5%
Memory size: 464.2 KiB

Length

Max length: 16
Median length: 3
Mean length: 4.6447354
Min length: 3

Characters and Unicode

Total characters: 257885
Distinct characters: 28
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: VWC
2nd row: Other
3rd row: VWC
4th row: VWC
5th row: VWC

Common Values

Value | Count | Frequency (%)
VWC | 36793 | 61.9%
WUG | 5206 | 8.8%
Water authority | 3153 | 5.3%
WUA | 2883 | 4.9%
Water Board | 2748 | 4.6%
Parastatal | 1680 | 2.8%
Private operator | 1063 | 1.8%
Company | 1061 | 1.8%
Other | 766 | 1.3%
SWC | 97 | 0.2%
(Missing) | 3878 | 6.5%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
vwc 36793
58.9%
water 5901
 
9.4%
wug 5206
 
8.3%
authority 3153
 
5.0%
wua 2883
 
4.6%
board 2748
 
4.4%
parastatal 1680
 
2.7%
private 1063
 
1.7%
operator 1063
 
1.7%
company 1061
 
1.7%
Other values (3) 935
 
1.5%

Most occurring characters

ValueCountFrequency (%)
W 50880
19.7%
C 37951
14.7%
V 36793
14.3%
a 21709
8.4%
t 18531
 
7.2%
r 17509
 
6.8%
o 9088
 
3.5%
e 8793
 
3.4%
U 8089
 
3.1%
6964
 
2.7%
Other values (18) 41578
16.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 148228
57.5%
Lowercase Letter 102693
39.8%
Space Separator 6964
 
2.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 21709
21.1%
t 18531
18.0%
r 17509
17.0%
o 9088
8.8%
e 8793
8.6%
i 4216
 
4.1%
y 4214
 
4.1%
h 3919
 
3.8%
u 3225
 
3.1%
d 2748
 
2.7%
Other values (6) 8741
8.5%
Uppercase Letter
ValueCountFrequency (%)
W 50880
34.3%
C 37951
25.6%
V 36793
24.8%
U 8089
 
5.5%
G 5206
 
3.5%
A 2883
 
1.9%
B 2748
 
1.9%
P 2743
 
1.9%
O 766
 
0.5%
S 97
 
0.1%
Space Separator
ValueCountFrequency (%)
6964
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 250921
97.3%
Common 6964
 
2.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
W 50880
20.3%
C 37951
15.1%
V 36793
14.7%
a 21709
8.7%
t 18531
 
7.4%
r 17509
 
7.0%
o 9088
 
3.6%
e 8793
 
3.5%
U 8089
 
3.2%
G 5206
 
2.1%
Other values (17) 36372
14.5%
Common
ValueCountFrequency (%)
6964
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 257885
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
W 50880
19.7%
C 37951
14.7%
V 36793
14.3%
a 21709
8.4%
t 18531
 
7.2%
r 17509
 
6.8%
o 9088
 
3.5%
e 8793
 
3.4%
U 8089
 
3.1%
6964
 
2.7%
Other values (18) 41578
16.1%

scheme_name
Text

MISSING 

Distinct: 2695
Distinct (%): 8.8%
Missing: 28810
Missing (%): 48.5%
Memory size: 464.2 KiB

Length

Max length: 46
Median length: 37
Mean length: 14.522164
Min length: 1

Characters and Unicode

Total characters: 444233
Distinct characters: 68
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 712
Unique (%): 2.3%

Sample

1st row: Roman
2nd row: Nyumba ya mungu pipe scheme
3rd row: Zingibali
4th row: BL Bondeni
5th row: wanging'ombe water supply s

Words

Value | Count | Frequency (%)
water | 9770 | 13.7%
supply | 6745 | 9.5%
scheme | 2532 | 3.5%
wa | 2157 | 3.0%
gravity | 1914 | 2.7%
pipe | 1346 | 1.9%
maji | 1343 | 1.9%
mradi | 1097 | 1.5%
line | 1016 | 1.4%
supplied | 877 | 1.2%
Other values (2506) | 42575 | 59.7%

Most occurring characters

ValueCountFrequency (%)
a 48584
 
10.9%
41252
 
9.3%
e 34595
 
7.8%
i 26411
 
5.9%
p 22451
 
5.1%
r 21816
 
4.9%
t 19216
 
4.3%
u 18441
 
4.2%
l 17308
 
3.9%
n 17116
 
3.9%
Other values (58) 177043
39.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 351251
79.1%
Uppercase Letter 49420
 
11.1%
Space Separator 41252
 
9.3%
Other Punctuation 1317
 
0.3%
Dash Punctuation 554
 
0.1%
Open Punctuation 191
 
< 0.1%
Decimal Number 147
 
< 0.1%
Modifier Symbol 70
 
< 0.1%
Close Punctuation 31
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 48584
13.8%
e 34595
 
9.8%
i 26411
 
7.5%
p 22451
 
6.4%
r 21816
 
6.2%
t 19216
 
5.5%
u 18441
 
5.3%
l 17308
 
4.9%
n 17116
 
4.9%
o 16774
 
4.8%
Other values (16) 108539
30.9%
Uppercase Letter
ValueCountFrequency (%)
M 9314
18.8%
K 5600
11.3%
N 3795
 
7.7%
S 3770
 
7.6%
A 2729
 
5.5%
I 2691
 
5.4%
W 2531
 
5.1%
B 2387
 
4.8%
L 2107
 
4.3%
U 1790
 
3.6%
Other values (15) 12706
25.7%
Decimal Number
ValueCountFrequency (%)
2 61
41.5%
3 55
37.4%
1 7
 
4.8%
4 7
 
4.8%
7 7
 
4.8%
5 4
 
2.7%
0 3
 
2.0%
6 3
 
2.0%
Other Punctuation
ValueCountFrequency (%)
' 938
71.2%
/ 370
 
28.1%
& 8
 
0.6%
: 1
 
0.1%
Space Separator
ValueCountFrequency (%)
41252
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 554
100.0%
Open Punctuation
ValueCountFrequency (%)
( 191
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 70
100.0%
Close Punctuation
ValueCountFrequency (%)
) 31
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 400671
90.2%
Common 43562
 
9.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 48584
 
12.1%
e 34595
 
8.6%
i 26411
 
6.6%
p 22451
 
5.6%
r 21816
 
5.4%
t 19216
 
4.8%
u 18441
 
4.6%
l 17308
 
4.3%
n 17116
 
4.3%
o 16774
 
4.2%
Other values (41) 157959
39.4%
Common
ValueCountFrequency (%)
41252
94.7%
' 938
 
2.2%
- 554
 
1.3%
/ 370
 
0.8%
( 191
 
0.4%
` 70
 
0.2%
2 61
 
0.1%
3 55
 
0.1%
) 31
 
0.1%
& 8
 
< 0.1%
Other values (7) 32
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 444233
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 48584
 
10.9%
41252
 
9.3%
e 34595
 
7.8%
i 26411
 
5.9%
p 22451
 
5.1%
r 21816
 
4.9%
t 19216
 
4.3%
u 18441
 
4.2%
l 17308
 
3.9%
n 17116
 
3.9%
Other values (58) 177043
39.9%

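The category names in the scheme_name tables correspond to Unicode general categories. The sketch below uses Python's standard `unicodedata` module to tally categories for two of the sample rows listed for this variable. The mapping from two-letter codes to the report's labels is written out by hand and covers only the categories that appear in this section; `category_counts` is an illustrative helper name.

```python
import unicodedata
from collections import Counter

# Two-letter general-category codes mapped to the labels used in the report.
LABELS = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
    "Pd": "Dash Punctuation",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Nd": "Decimal Number",
    "Sk": "Modifier Symbol",
}

def category_counts(text):
    """Count the characters of `text` by Unicode general-category label."""
    return Counter(LABELS.get(unicodedata.category(ch), "Other") for ch in text)

# Two sample rows from the scheme_name section.
rows = ["BL Bondeni", "wanging'ombe water supply s"]
for row in rows:
    print(row, dict(category_counts(row)))
```

The apostrophe counts as Other Punctuation while the backtick counts as Modifier Symbol, which is why the two appear in separate tables even though they look similar in the data.
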
permit
Boolean

MISSING 

Distinct: 2
Distinct (%): < 0.1%
Missing: 3056
Missing (%): 5.1%
Memory size: 464.2 KiB

Value | Count | Frequency (%)
True | 38852 | 65.4%
False | 17492 | 29.4%
(Missing) | 3056 | 5.1%

construction_year
Real number (ℝ)

ZEROS 

Distinct: 55
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1300.6525
Minimum: 0
Maximum: 2013
Zeros: 20709
Zeros (%): 34.9%
Negative: 0
Negative (%): 0.0%
Memory size: 464.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 1986
Q3: 2004
95-th percentile: 2010
Maximum: 2013
Range: 2013
Interquartile range (IQR): 2004

Descriptive statistics

Standard deviation: 951.62055
Coefficient of variation (CV): 0.73164859
Kurtosis: -1.5964324
Mean: 1300.6525
Median Absolute Deviation (MAD): 22
Skewness: -0.63492779
Sum: 77258757
Variance: 905581.67
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value | Count | Frequency (%)
0 | 20709 | 34.9%
2010 | 2645 | 4.5%
2008 | 2613 | 4.4%
2009 | 2533 | 4.3%
2000 | 2091 | 3.5%
2007 | 1587 | 2.7%
2006 | 1471 | 2.5%
2003 | 1286 | 2.2%
2011 | 1256 | 2.1%
2004 | 1123 | 1.9%
Other values (45) | 22086 | 37.2%

Minimum 10 values

Value | Count | Frequency (%)
0 | 20709 | 34.9%
1960 | 102 | 0.2%
1961 | 21 | < 0.1%
1962 | 30 | 0.1%
1963 | 85 | 0.1%
1964 | 40 | 0.1%
1965 | 19 | < 0.1%
1966 | 17 | < 0.1%
1967 | 88 | 0.1%
1968 | 77 | 0.1%

Maximum 10 values

Value | Count | Frequency (%)
2013 | 176 | 0.3%
2012 | 1084 | 1.8%
2011 | 1256 | 2.1%
2010 | 2645 | 4.5%
2009 | 2533 | 4.3%
2008 | 2613 | 4.4%
2007 | 1587 | 2.7%
2006 | 1471 | 2.5%
2005 | 1011 | 1.7%
2004 | 1123 | 1.9%

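With 20,709 zeros and the next-smallest value at 1960, the zeros in construction_year look like placeholders for an unknown year rather than real years, which would explain the reported mean of 1300.65. The sketch below assumes zeros mean "unknown" and recovers the mean of the remaining rows from the reported sum and counts alone. The variable names are illustrative; the row count comes from the report overview.

```python
# Figures copied from the report; names are illustrative.
n_rows = 59400       # number of observations
n_zeros = 20709      # Zeros
total = 77258757     # Sum

n_known = n_rows - n_zeros      # rows with a non-zero year
mean_all = total / n_rows       # matches the reported mean of 1300.6525
mean_known = total / n_known    # zeros add nothing to the sum, so it is unchanged

print(n_known, round(mean_all, 4), round(mean_known, 2))
```

Under that assumption the known construction years average about 1996.8, which sits inside the 1960 to 2013 range of the non-zero values. The reported standard deviation, skewness, and kurtosis are distorted by the zeros in the same way and cannot be corrected from the summary figures alone.
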
extraction_type
Categorical

Distinct: 18
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 464.2 KiB

Length

Max length: 25
Median length: 17
Mean length: 7.7195118
Min length: 3

Characters and Unicode

Total characters: 458539
Distinct characters: 29
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: gravity
2nd row: gravity
3rd row: gravity
4th row: submersible
5th row: gravity

Common Values

Value | Count | Frequency (%)
gravity | 26780 | 45.1%
nira/tanira | 8154 | 13.7%
other | 6430 | 10.8%
submersible | 4764 | 8.0%
swn 80 | 3670 | 6.2%
mono | 2865 | 4.8%
india mark ii | 2400 | 4.0%
afridev | 1770 | 3.0%
ksb | 1415 | 2.4%
other - rope pump | 451 | 0.8%
Other values (8) | 701 | 1.2%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
gravity 26780
38.1%
nira/tanira 8154
 
11.6%
other 7197
 
10.2%
submersible 4764
 
6.8%
swn 3899
 
5.5%
80 3670
 
5.2%
mono 2865
 
4.1%
india 2498
 
3.6%
mark 2498
 
3.6%
ii 2400
 
3.4%
Other values (13) 5640
 
8.0%

Most occurring characters

ValueCountFrequency (%)
i 60078
13.1%
r 59768
13.0%
a 58179
12.7%
t 42131
9.2%
v 28550
 
6.2%
y 26867
 
5.9%
g 26782
 
5.8%
n 25691
 
5.6%
e 19036
 
4.2%
s 14844
 
3.2%
Other values (19) 96613
21.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 430853
94.0%
Space Separator 10965
 
2.4%
Other Punctuation 8156
 
1.8%
Decimal Number 7798
 
1.7%
Dash Punctuation 767
 
0.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 60078
13.9%
r 59768
13.9%
a 58179
13.5%
t 42131
9.8%
v 28550
6.6%
y 26867
 
6.2%
g 26782
 
6.2%
n 25691
 
6.0%
e 19036
 
4.4%
s 14844
 
3.4%
Other values (13) 68927
16.0%
Decimal Number
ValueCountFrequency (%)
8 3899
50.0%
0 3670
47.1%
1 229
 
2.9%
Space Separator
ValueCountFrequency (%)
10965
100.0%
Other Punctuation
ValueCountFrequency (%)
/ 8156
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 767
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 430853
94.0%
Common 27686
 
6.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 60078
13.9%
r 59768
13.9%
a 58179
13.5%
t 42131
9.8%
v 28550
6.6%
y 26867
 
6.2%
g 26782
 
6.2%
n 25691
 
6.0%
e 19036
 
4.4%
s 14844
 
3.4%
Other values (13) 68927
16.0%
Common
ValueCountFrequency (%)
10965
39.6%
/ 8156
29.5%
8 3899
 
14.1%
0 3670
 
13.3%
- 767
 
2.8%
1 229
 
0.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 458539
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 60078
13.1%
r 59768
13.0%
a 58179
12.7%
t 42131
9.2%
v 28550
 
6.2%
y 26867
 
5.9%
g 26782
 
5.8%
n 25691
 
5.6%
e 19036
 
4.2%
s 14844
 
3.2%
Other values (19) 96613
21.1%

extraction_type_group
Categorical
Distinct: 13
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 464.2 KiB

Length

Max length: 15
Median length: 14
Mean length: 7.8805387
Min length: 4

Characters and Unicode

Total characters: 468104
Distinct characters: 26
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: gravity
2nd row: gravity
3rd row: gravity
4th row: submersible
5th row: gravity

Common Values

Value | Count | Frequency (%)
gravity | 26780 | 45.1%
nira/tanira | 8154 | 13.7%
other | 6430 | 10.8%
submersible | 6179 | 10.4%
swn 80 | 3670 | 6.2%
mono | 2865 | 4.8%
india mark ii | 2400 | 4.0%
afridev | 1770 | 3.0%
rope pump | 451 | 0.8%
other handpump | 364 | 0.6%
Other values (3) | 337 | 0.6%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
gravity 26780
38.8%
nira/tanira 8154
 
11.8%
other 6916
 
10.0%
submersible 6179
 
9.0%
swn 3670
 
5.3%
80 3670
 
5.3%
mono 2865
 
4.2%
mark 2498
 
3.6%
india 2498
 
3.6%
ii 2400
 
3.5%
Other values (7) 3373
 
4.9%

Most occurring characters

ValueCountFrequency (%)
i 61244
13.1%
r 61141
13.1%
a 58372
12.5%
t 41972
9.0%
v 28550
 
6.1%
g 26780
 
5.7%
y 26780
 
5.7%
n 25822
 
5.5%
e 21729
 
4.6%
s 16028
 
3.4%
Other values (16) 99686
21.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 442890
94.6%
Space Separator 9603
 
2.1%
Other Punctuation 8154
 
1.7%
Decimal Number 7340
 
1.6%
Dash Punctuation 117
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 61244
13.8%
r 61141
13.8%
a 58372
13.2%
t 41972
9.5%
v 28550
 
6.4%
g 26780
 
6.0%
y 26780
 
6.0%
n 25822
 
5.8%
e 21729
 
4.9%
s 16028
 
3.6%
Other values (11) 74472
16.8%
Decimal Number
ValueCountFrequency (%)
8 3670
50.0%
0 3670
50.0%
Space Separator
ValueCountFrequency (%)
9603
100.0%
Other Punctuation
ValueCountFrequency (%)
/ 8154
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 117
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 442890
94.6%
Common 25214
 
5.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 61244
13.8%
r 61141
13.8%
a 58372
13.2%
t 41972
9.5%
v 28550
 
6.4%
g 26780
 
6.0%
y 26780
 
6.0%
n 25822
 
5.8%
e 21729
 
4.9%
s 16028
 
3.6%
Other values (11) 74472
16.8%
Common
ValueCountFrequency (%)
9603
38.1%
/ 8154
32.3%
8 3670
 
14.6%
0 3670
 
14.6%
- 117
 
0.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 468104
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 61244
13.1%
r 61141
13.1%
a 58372
12.5%
t 41972
9.0%
v 28550
 
6.1%
g 26780
 
5.7%
y 26780
 
5.7%
n 25822
 
5.5%
e 21729
 
4.6%
s 16028
 
3.4%
Other values (16) 99686
21.3%

extraction_type_class
Categorical
Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 464.2 KiB

Length

Max length: 12
Median length: 11
Mean length: 7.6022391
Min length: 5

Characters and Unicode

Total characters: 451573
Distinct characters: 21
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: gravity
2nd row: gravity
3rd row: gravity
4th row: submersible
5th row: gravity

Common Values

Value | Count | Frequency (%)
gravity | 26780 | 45.1%
handpump | 16456 | 27.7%
other | 6430 | 10.8%
submersible | 6179 | 10.4%
motorpump | 2987 | 5.0%
rope pump | 451 | 0.8%
wind-powered | 117 | 0.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
gravity 26780
44.7%
handpump 16456
27.5%
other 6430
 
10.7%
submersible 6179
 
10.3%
motorpump 2987
 
5.0%
rope 451
 
0.8%
pump 451
 
0.8%
wind-powered 117
 
0.2%

Most occurring characters

ValueCountFrequency (%)
a 43236
 
9.6%
r 42944
 
9.5%
p 40356
 
8.9%
t 36197
 
8.0%
i 33076
 
7.3%
m 29060
 
6.4%
g 26780
 
5.9%
y 26780
 
5.9%
v 26780
 
5.9%
u 26073
 
5.8%
Other values (11) 120291
26.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 451005
99.9%
Space Separator 451
 
0.1%
Dash Punctuation 117
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 43236
 
9.6%
r 42944
 
9.5%
p 40356
 
8.9%
t 36197
 
8.0%
i 33076
 
7.3%
m 29060
 
6.4%
g 26780
 
5.9%
y 26780
 
5.9%
v 26780
 
5.9%
u 26073
 
5.8%
Other values (9) 119723
26.5%
Space Separator
ValueCountFrequency (%)
451
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 117
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 451005
99.9%
Common 568
 
0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 43236
 
9.6%
r 42944
 
9.5%
p 40356
 
8.9%
t 36197
 
8.0%
i 33076
 
7.3%
m 29060
 
6.4%
g 26780
 
5.9%
y 26780
 
5.9%
v 26780
 
5.9%
u 26073
 
5.8%
Other values (9) 119723
26.5%
Common
ValueCountFrequency (%)
451
79.4%
- 117
 
20.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 451573
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 43236
 
9.6%
r 42944
 
9.5%
p 40356
 
8.9%
t 36197
 
8.0%
i 33076
 
7.3%
m 29060
 
6.4%
g 26780
 
5.9%
y 26780
 
5.9%
v 26780
 
5.9%
u 26073
 
5.8%
Other values (11) 120291
26.6%

management
Categorical

Distinct: 12
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 464.2 KiB

Length

Max length: 16
Median length: 3
Mean length: 4.3506397
Min length: 3

Characters and Unicode

Total characters: 258428
Distinct characters: 23
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: vwc
2nd row: wug
3rd row: vwc
4th row: vwc
5th row: other

Common Values

Value | Count | Frequency (%)
vwc | 40507 | 68.2%
wug | 6515 | 11.0%
water board | 2933 | 4.9%
wua | 2535 | 4.3%
private operator | 1971 | 3.3%
parastatal | 1768 | 3.0%
water authority | 904 | 1.5%
other | 844 | 1.4%
company | 685 | 1.2%
unknown | 561 | 0.9%
Other values (2) | 177 | 0.3%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
vwc 40507
61.9%
wug 6515
 
10.0%
water 3837
 
5.9%
board 2933
 
4.5%
wua 2535
 
3.9%
private 1971
 
3.0%
operator 1971
 
3.0%
parastatal 1768
 
2.7%
other 943
 
1.4%
authority 904
 
1.4%
Other values (5) 1522
 
2.3%

Most occurring characters

ValueCountFrequency (%)
w 53955
20.9%
v 42478
16.4%
c 41291
16.0%
a 21908
8.5%
r 16376
 
6.3%
t 14222
 
5.5%
u 10593
 
4.1%
o 10166
 
3.9%
e 8722
 
3.4%
g 6515
 
2.5%
Other values (13) 32202
12.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 252323
97.6%
Space Separator 6006
 
2.3%
Dash Punctuation 99
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
w 53955
21.4%
v 42478
16.8%
c 41291
16.4%
a 21908
8.7%
r 16376
 
6.5%
t 14222
 
5.6%
u 10593
 
4.2%
o 10166
 
4.0%
e 8722
 
3.5%
g 6515
 
2.6%
Other values (11) 26097
10.3%
Space Separator
ValueCountFrequency (%)
6006
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 99
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 252323
97.6%
Common 6105
 
2.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
w 53955
21.4%
v 42478
16.8%
c 41291
16.4%
a 21908
8.7%
r 16376
 
6.5%
t 14222
 
5.6%
u 10593
 
4.2%
o 10166
 
4.0%
e 8722
 
3.5%
g 6515
 
2.6%
Other values (11) 26097
10.3%
Common
ValueCountFrequency (%)
6006
98.4%
- 99
 
1.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 258428
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
w 53955
20.9%
v 42478
16.4%
c 41291
16.0%
a 21908
8.5%
r 16376
 
6.3%
t 14222
 
5.5%
u 10593
 
4.1%
o 10166
 
3.9%
e 8722
 
3.4%
g 6515
 
2.5%
Other values (13) 32202
12.5%

management_group
Categorical

IMBALANCE 

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 464.2 KiB

Length

Max length: 10
Median length: 10
Mean length: 9.8922896
Min length: 5

Characters and Unicode

Total characters: 587602
Distinct characters: 18
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: user-group
2nd row: user-group
3rd row: user-group
4th row: user-group
5th row: other

Common Values

Value | Count | Frequency (%)
user-group | 52490 | 88.4%
commercial | 3638 | 6.1%
parastatal | 1768 | 3.0%
other | 943 | 1.6%
unknown | 561 | 0.9%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
user-group 52490
88.4%
commercial 3638
 
6.1%
parastatal 1768
 
3.0%
other 943
 
1.6%
unknown 561
 
0.9%

Most occurring characters

ValueCountFrequency (%)
r 111329
18.9%
u 105541
18.0%
o 57632
9.8%
e 57071
9.7%
s 54258
9.2%
p 54258
9.2%
- 52490
8.9%
g 52490
8.9%
a 10710
 
1.8%
m 7276
 
1.2%
Other values (8) 24547
 
4.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 535112
91.1%
Dash Punctuation 52490
 
8.9%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 111329
20.8%
u 105541
19.7%
o 57632
10.8%
e 57071
10.7%
s 54258
10.1%
p 54258
10.1%
g 52490
9.8%
a 10710
 
2.0%
m 7276
 
1.4%
c 7276
 
1.4%
Other values (7) 17271
 
3.2%
Dash Punctuation
ValueCountFrequency (%)
- 52490
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 535112
91.1%
Common 52490
 
8.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 111329
20.8%
u 105541
19.7%
o 57632
10.8%
e 57071
10.7%
s 54258
10.1%
p 54258
10.1%
g 52490
9.8%
a 10710
 
2.0%
m 7276
 
1.4%
c 7276
 
1.4%
Other values (7) 17271
 
3.2%
Common
ValueCountFrequency (%)
- 52490
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 587602
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
r 111329
18.9%
u 105541
18.0%
o 57632
9.8%
e 57071
9.7%
s 54258
9.2%
p 54258
9.2%
- 52490
8.9%
g 52490
8.9%
a 10710
 
1.8%
m 7276
 
1.2%
Other values (8) 24547
 
4.2%

payment
Categorical

Distinct7
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size464.2 KiB
never pay
25348 
pay per bucket
8985 
pay monthly
8300 
unknown
8157 
pay when scheme fails
3914 
Other values (2)
4696 

Length

Max length21
Median length14
Mean length10.664798
Min length5

Characters and Unicode

Total characters633489
Distinct characters21
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowpay annually
2nd rownever pay
3rd rowpay per bucket
4th rownever pay
5th rownever pay

Common Values

Value                  Count  Frequency (%)
never pay              25348  42.7%
pay per bucket          8985  15.1%
pay monthly             8300  14.0%
unknown                 8157  13.7%
pay when scheme fails   3914   6.6%
pay annually            3642   6.1%
other                   1054   1.8%

Length

Histogram of lengths of the category

Common Values (Plot)

[Plot of common values]
ValueCountFrequency (%)
pay 50189
39.7%
never 25348
20.1%
per 8985
 
7.1%
bucket 8985
 
7.1%
monthly 8300
 
6.6%
unknown 8157
 
6.5%
when 3914
 
3.1%
scheme 3914
 
3.1%
fails 3914
 
3.1%
annually 3642
 
2.9%
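The word counts in the table above appear consistent with splitting each category value on whitespace and weighting every word by the number of rows holding that value. The sketch below checks that reading for `payment`; the helper name `word_totals` is hypothetical, and the tokenization rule is an assumption rather than documented ydata-profiling behaviour.

```python
from collections import Counter

# Value counts for `payment`, copied from this report's Common Values table.
PAYMENT_COUNTS = {
    "never pay": 25348,
    "pay per bucket": 8985,
    "pay monthly": 8300,
    "unknown": 8157,
    "pay when scheme fails": 3914,
    "pay annually": 3642,
    "other": 1054,
}


def word_totals(value_counts):
    """Count whitespace-separated words, weighted by row count."""
    totals = Counter()
    for value, n_rows in value_counts.items():
        for word in value.split():
            totals[word] += n_rows
    return totals


words = word_totals(PAYMENT_COUNTS)
total_words = sum(words.values())

# Print the three most frequent words with their share of all words.
for word, n in words.most_common(3):
    print(f"{word:<8}{n:>7}{100 * n / total_words:6.1f}%")
```

Under this assumption the percentages are shares of all words (including `other`, which falls outside the ten rows listed), not shares of rows, which is why `pay` sits near 40% even though it occurs in most rows.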

Most occurring characters

ValueCountFrequency (%)
e 81462
12.9%
n 69317
10.9%
67002
10.6%
y 62131
9.8%
a 61387
9.7%
p 59174
9.3%
r 35387
 
5.6%
v 25348
 
4.0%
u 20784
 
3.3%
l 19498
 
3.1%
Other values (11) 131999
20.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 566487
89.4%
Space Separator 67002
 
10.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 81462
14.4%
n 69317
12.2%
y 62131
11.0%
a 61387
10.8%
p 59174
10.4%
r 35387
 
6.2%
v 25348
 
4.5%
u 20784
 
3.7%
l 19498
 
3.4%
t 18339
 
3.2%
Other values (10) 113660
20.1%
Space Separator
ValueCountFrequency (%)
67002
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 566487
89.4%
Common 67002
 
10.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 81462
14.4%
n 69317
12.2%
y 62131
11.0%
a 61387
10.8%
p 59174
10.4%
r 35387
 
6.2%
v 25348
 
4.5%
u 20784
 
3.7%
l 19498
 
3.4%
t 18339
 
3.2%
Other values (10) 113660
20.1%
Common
ValueCountFrequency (%)
67002
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 633489
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 81462
12.9%
n 69317
10.9%
67002
10.6%
y 62131
9.8%
a 61387
9.7%
p 59174
9.3%
r 35387
 
5.6%
v 25348
 
4.0%
u 20784
 
3.3%
l 19498
 
3.1%
Other values (11) 131999
20.8%

payment_type
Categorical

Distinct7
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size464.2 KiB
never pay
25348 
per bucket
8985 
monthly
8300 
unknown
8157 
on failure
3914 
Other values (2)
4696 

Length

Max length10
Median length9
Mean length8.5307576
Min length5

Characters and Unicode

Total characters506727
Distinct characters20
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowannually
2nd rownever pay
3rd rowper bucket
4th rownever pay
5th rownever pay

Common Values

Value       Count  Frequency (%)
never pay   25348  42.7%
per bucket   8985  15.1%
monthly      8300  14.0%
unknown      8157  13.7%
on failure   3914   6.6%
annually     3642   6.1%
other        1054   1.8%

Length

Histogram of lengths of the category

Common Values (Plot)

[Plot of common values]
ValueCountFrequency (%)
never 25348
26.0%
pay 25348
26.0%
per 8985
 
9.2%
bucket 8985
 
9.2%
monthly 8300
 
8.5%
unknown 8157
 
8.4%
on 3914
 
4.0%
failure 3914
 
4.0%
annually 3642
 
3.7%
other 1054
 
1.1%

Most occurring characters

ValueCountFrequency (%)
e 73634
14.5%
n 69317
13.7%
r 39301
 
7.8%
38247
 
7.5%
y 37290
 
7.4%
a 36546
 
7.2%
p 34333
 
6.8%
v 25348
 
5.0%
u 24698
 
4.9%
o 21425
 
4.2%
Other values (10) 106588
21.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 468480
92.5%
Space Separator 38247
 
7.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 73634
15.7%
n 69317
14.8%
r 39301
8.4%
y 37290
8.0%
a 36546
7.8%
p 34333
 
7.3%
v 25348
 
5.4%
u 24698
 
5.3%
o 21425
 
4.6%
l 19498
 
4.2%
Other values (9) 87090
18.6%
Space Separator
ValueCountFrequency (%)
38247
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 468480
92.5%
Common 38247
 
7.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 73634
15.7%
n 69317
14.8%
r 39301
8.4%
y 37290
8.0%
a 36546
7.8%
p 34333
 
7.3%
v 25348
 
5.4%
u 24698
 
5.3%
o 21425
 
4.6%
l 19498
 
4.2%
Other values (9) 87090
18.6%
Common
ValueCountFrequency (%)
38247
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 506727
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 73634
14.5%
n 69317
13.7%
r 39301
 
7.8%
38247
 
7.5%
y 37290
 
7.4%
a 36546
 
7.2%
p 34333
 
6.8%
v 25348
 
5.0%
u 24698
 
4.9%
o 21425
 
4.2%
Other values (10) 106588
21.0%

water_quality
Categorical

IMBALANCE 

Distinct8
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size464.2 KiB
soft
50818 
salty
 
4856
unknown
 
1876
milky
 
804
coloured
 
490
Other values (3)
 
556

Length

Max length18
Median length4
Mean length4.3032828
Min length4

Characters and Unicode

Total characters255615
Distinct characters19
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowsoft
2nd rowsoft
3rd rowsoft
4th rowsoft
5th rowsoft

Common Values

Value               Count  Frequency (%)
soft                50818  85.6%
salty                4856   8.2%
unknown              1876   3.2%
milky                 804   1.4%
coloured              490   0.8%
salty abandoned       339   0.6%
fluoride              200   0.3%
fluoride abandoned     17  < 0.1%
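The IMBALANCE flag on this variable can be sanity-checked from the counts above. The sketch below assumes the imbalance score is one minus the Shannon entropy of the class distribution, normalized by the maximum entropy for that number of classes; the report does not state the formula, so this is an assumption, and `imbalance_score` is a hypothetical helper name. With the `water_quality` counts it should land close to the 71.3% quoted in the overview alerts.

```python
import math

# Row counts for the eight `water_quality` classes, from the table above.
WATER_QUALITY_COUNTS = [50818, 4856, 1876, 804, 490, 339, 200, 17]


def imbalance_score(counts):
    """Return 1 - H / H_max, where H is the Shannon entropy in bits.

    0.0 means perfectly balanced classes; values near 1.0 mean that
    a single class dominates.
    """
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0
    total = sum(counts)
    entropy = -sum(
        (c / total) * math.log2(c / total) for c in counts if c > 0
    )
    return 1.0 - entropy / math.log2(n_classes)


score = imbalance_score(WATER_QUALITY_COUNTS)
print(f"{score:.1%}")
```

A score this high mostly reflects that `soft` covers about 86% of rows, so a model trained on this column sees very few examples of the remaining classes.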

Length

Histogram of lengths of the category

Common Values (Plot)

[Plot of common values]
ValueCountFrequency (%)
soft 50818
85.0%
salty 5195
 
8.7%
unknown 1876
 
3.1%
milky 804
 
1.3%
coloured 490
 
0.8%
abandoned 356
 
0.6%
fluoride 217
 
0.4%

Most occurring characters

ValueCountFrequency (%)
s 56013
21.9%
t 56013
21.9%
o 54247
21.2%
f 51035
20.0%
l 6706
 
2.6%
n 6340
 
2.5%
y 5999
 
2.3%
a 5907
 
2.3%
k 2680
 
1.0%
u 2583
 
1.0%
Other values (9) 8092
 
3.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 255259
99.9%
Space Separator 356
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 56013
21.9%
t 56013
21.9%
o 54247
21.3%
f 51035
20.0%
l 6706
 
2.6%
n 6340
 
2.5%
y 5999
 
2.4%
a 5907
 
2.3%
k 2680
 
1.0%
u 2583
 
1.0%
Other values (8) 7736
 
3.0%
Space Separator
ValueCountFrequency (%)
356
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 255259
99.9%
Common 356
 
0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
s 56013
21.9%
t 56013
21.9%
o 54247
21.3%
f 51035
20.0%
l 6706
 
2.6%
n 6340
 
2.5%
y 5999
 
2.4%
a 5907
 
2.3%
k 2680
 
1.0%
u 2583
 
1.0%
Other values (8) 7736
 
3.0%
Common
ValueCountFrequency (%)
356
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 255615
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s 56013
21.9%
t 56013
21.9%
o 54247
21.2%
f 51035
20.0%
l 6706
 
2.6%
n 6340
 
2.5%
y 5999
 
2.3%
a 5907
 
2.3%
k 2680
 
1.0%
u 2583
 
1.0%
Other values (9) 8092
 
3.2%

quality_group
Categorical

IMBALANCE 

Distinct6
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size464.2 KiB
good
50818 
salty
5195 
unknown
 
1876
milky
 
804
colored
 
490

Length

Max length8
Median length4
Mean length4.235101
Min length4

Characters and Unicode

Total characters251565
Distinct characters18
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowgood
2nd rowgood
3rd rowgood
4th rowgood
5th rowgood

Common Values

Value     Count  Frequency (%)
good      50818  85.6%
salty      5195   8.7%
unknown    1876   3.2%
milky       804   1.4%
colored     490   0.8%
fluoride    217   0.4%

Length

Histogram of lengths of the category

Common Values (Plot)

[Plot of common values]
ValueCountFrequency (%)
good 50818
85.6%
salty 5195
 
8.7%
unknown 1876
 
3.2%
milky 804
 
1.4%
colored 490
 
0.8%
fluoride 217
 
0.4%

Most occurring characters

ValueCountFrequency (%)
o 104709
41.6%
d 51525
20.5%
g 50818
20.2%
l 6706
 
2.7%
y 5999
 
2.4%
n 5628
 
2.2%
t 5195
 
2.1%
a 5195
 
2.1%
s 5195
 
2.1%
k 2680
 
1.1%
Other values (8) 7915
 
3.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 251565
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 104709
41.6%
d 51525
20.5%
g 50818
20.2%
l 6706
 
2.7%
y 5999
 
2.4%
n 5628
 
2.2%
t 5195
 
2.1%
a 5195
 
2.1%
s 5195
 
2.1%
k 2680
 
1.1%
Other values (8) 7915
 
3.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 251565
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 104709
41.6%
d 51525
20.5%
g 50818
20.2%
l 6706
 
2.7%
y 5999
 
2.4%
n 5628
 
2.2%
t 5195
 
2.1%
a 5195
 
2.1%
s 5195
 
2.1%
k 2680
 
1.1%
Other values (8) 7915
 
3.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 251565
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o 104709
41.6%
d 51525
20.5%
g 50818
20.2%
l 6706
 
2.7%
y 5999
 
2.4%
n 5628
 
2.2%
t 5195
 
2.1%
a 5195
 
2.1%
s 5195
 
2.1%
k 2680
 
1.1%
Other values (8) 7915
 
3.1%

quantity
Categorical

Distinct5
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size464.2 KiB
enough
33186 
insufficient
15129 
dry
6246 
seasonal
4050 
unknown
 
789

Length

Max length12
Median length6
Mean length7.3623737
Min length3

Characters and Unicode

Total characters437325
Distinct characters18
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowenough
2nd rowinsufficient
3rd rowenough
4th rowdry
5th rowseasonal

Common Values

Value         Count  Frequency (%)
enough        33186  55.9%
insufficient  15129  25.5%
dry            6246  10.5%
seasonal       4050   6.8%
unknown         789   1.3%

Length

Histogram of lengths of the category

Common Values (Plot)

[Plot of common values]
ValueCountFrequency (%)
enough 33186
55.9%
insufficient 15129
25.5%
dry 6246
 
10.5%
seasonal 4050
 
6.8%
unknown 789
 
1.3%

Most occurring characters

ValueCountFrequency (%)
n 69861
16.0%
e 52365
12.0%
u 49104
11.2%
i 45387
10.4%
o 38025
8.7%
g 33186
7.6%
h 33186
7.6%
f 30258
6.9%
s 23229
 
5.3%
t 15129
 
3.5%
Other values (8) 47595
10.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 437325
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 69861
16.0%
e 52365
12.0%
u 49104
11.2%
i 45387
10.4%
o 38025
8.7%
g 33186
7.6%
h 33186
7.6%
f 30258
6.9%
s 23229
 
5.3%
t 15129
 
3.5%
Other values (8) 47595
10.9%

Most occurring scripts

ValueCountFrequency (%)
Latin 437325
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 69861
16.0%
e 52365
12.0%
u 49104
11.2%
i 45387
10.4%
o 38025
8.7%
g 33186
7.6%
h 33186
7.6%
f 30258
6.9%
s 23229
 
5.3%
t 15129
 
3.5%
Other values (8) 47595
10.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 437325
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 69861
16.0%
e 52365
12.0%
u 49104
11.2%
i 45387
10.4%
o 38025
8.7%
g 33186
7.6%
h 33186
7.6%
f 30258
6.9%
s 23229
 
5.3%
t 15129
 
3.5%
Other values (8) 47595
10.9%

quantity_group
Categorical

Distinct5
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size464.2 KiB
enough
33186 
insufficient
15129 
dry
6246 
seasonal
4050 
unknown
 
789

Length

Max length12
Median length6
Mean length7.3623737
Min length3

Characters and Unicode

Total characters437325
Distinct characters18
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowenough
2nd rowinsufficient
3rd rowenough
4th rowdry
5th rowseasonal

Common Values

Value         Count  Frequency (%)
enough        33186  55.9%
insufficient  15129  25.5%
dry            6246  10.5%
seasonal       4050   6.8%
unknown         789   1.3%

Length

Histogram of lengths of the category

Common Values (Plot)

[Plot of common values]
ValueCountFrequency (%)
enough 33186
55.9%
insufficient 15129
25.5%
dry 6246
 
10.5%
seasonal 4050
 
6.8%
unknown 789
 
1.3%

Most occurring characters

ValueCountFrequency (%)
n 69861
16.0%
e 52365
12.0%
u 49104
11.2%
i 45387
10.4%
o 38025
8.7%
g 33186
7.6%
h 33186
7.6%
f 30258
6.9%
s 23229
 
5.3%
t 15129
 
3.5%
Other values (8) 47595
10.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 437325
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 69861
16.0%
e 52365
12.0%
u 49104
11.2%
i 45387
10.4%
o 38025
8.7%
g 33186
7.6%
h 33186
7.6%
f 30258
6.9%
s 23229
 
5.3%
t 15129
 
3.5%
Other values (8) 47595
10.9%

Most occurring scripts

ValueCountFrequency (%)
Latin 437325
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 69861
16.0%
e 52365
12.0%
u 49104
11.2%
i 45387
10.4%
o 38025
8.7%
g 33186
7.6%
h 33186
7.6%
f 30258
6.9%
s 23229
 
5.3%
t 15129
 
3.5%
Other values (8) 47595
10.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 437325
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 69861
16.0%
e 52365
12.0%
u 49104
11.2%
i 45387
10.4%
o 38025
8.7%
g 33186
7.6%
h 33186
7.6%
f 30258
6.9%
s 23229
 
5.3%
t 15129
 
3.5%
Other values (8) 47595
10.9%

source
Categorical

Distinct10
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size464.2 KiB
spring
17021 
shallow well
16824 
machine dbh
11075 
river
9612 
rainwater harvesting
2295 
Other values (5)
2573 

Length

Max length20
Median length12
Mean length8.9788047
Min length3

Characters and Unicode

Total characters533341
Distinct characters21
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowspring
2nd rowrainwater harvesting
3rd rowdam
4th rowmachine dbh
5th rowrainwater harvesting

Common Values

Value                 Count  Frequency (%)
spring                17021  28.7%
shallow well          16824  28.3%
machine dbh           11075  18.6%
river                  9612  16.2%
rainwater harvesting   2295   3.9%
hand dtw                874   1.5%
lake                    765   1.3%
dam                     656   1.1%
other                   212   0.4%
unknown                  66   0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

[Plot of common values]
ValueCountFrequency (%)
spring 17021
18.8%
shallow 16824
18.6%
well 16824
18.6%
machine 11075
12.2%
dbh 11075
12.2%
river 9612
10.6%
rainwater 2295
 
2.5%
harvesting 2295
 
2.5%
hand 874
 
1.0%
dtw 874
 
1.0%
Other values (4) 1699
 
1.9%

Most occurring characters

ValueCountFrequency (%)
l 68061
12.8%
r 43342
 
8.1%
e 43078
 
8.1%
h 42355
 
7.9%
i 42298
 
7.9%
a 37079
 
7.0%
w 36883
 
6.9%
s 36140
 
6.8%
n 33758
 
6.3%
31068
 
5.8%
Other values (11) 119279
22.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 502273
94.2%
Space Separator 31068
 
5.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
l 68061
13.6%
r 43342
8.6%
e 43078
8.6%
h 42355
8.4%
i 42298
8.4%
a 37079
 
7.4%
w 36883
 
7.3%
s 36140
 
7.2%
n 33758
 
6.7%
g 19316
 
3.8%
Other values (10) 99963
19.9%
Space Separator
ValueCountFrequency (%)
31068
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 502273
94.2%
Common 31068
 
5.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
l 68061
13.6%
r 43342
8.6%
e 43078
8.6%
h 42355
8.4%
i 42298
8.4%
a 37079
 
7.4%
w 36883
 
7.3%
s 36140
 
7.2%
n 33758
 
6.7%
g 19316
 
3.8%
Other values (10) 99963
19.9%
Common
ValueCountFrequency (%)
31068
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 533341
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
l 68061
12.8%
r 43342
 
8.1%
e 43078
 
8.1%
h 42355
 
7.9%
i 42298
 
7.9%
a 37079
 
7.0%
w 36883
 
6.9%
s 36140
 
6.8%
n 33758
 
6.3%
31068
 
5.8%
Other values (11) 119279
22.4%

source_type
Categorical

Distinct7
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size464.2 KiB
spring
17021 
shallow well
16824 
borehole
11949 
river/lake
10377 
rainwater harvesting
2295 
Other values (2)
 
934

Length

Max length20
Median length12
Mean length9.3036027
Min length3

Characters and Unicode

Total characters552634
Distinct characters20
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowspring
2nd rowrainwater harvesting
3rd rowdam
4th rowborehole
5th rowrainwater harvesting

Common Values

Value                 Count  Frequency (%)
spring                17021  28.7%
shallow well          16824  28.3%
borehole              11949  20.1%
river/lake            10377  17.5%
rainwater harvesting   2295   3.9%
dam                     656   1.1%
other                   278   0.5%

Length

Histogram of lengths of the category

Common Values (Plot)

[Plot of common values]
ValueCountFrequency (%)
spring 17021
21.7%
shallow 16824
21.4%
well 16824
21.4%
borehole 11949
15.2%
river/lake 10377
13.2%
rainwater 2295
 
2.9%
harvesting 2295
 
2.9%
dam 656
 
0.8%
other 278
 
0.4%

Most occurring characters

ValueCountFrequency (%)
l 89622
16.2%
e 66344
12.0%
r 56887
10.3%
o 41000
 
7.4%
s 36140
 
6.5%
w 35943
 
6.5%
a 34742
 
6.3%
i 31988
 
5.8%
h 31346
 
5.7%
n 21611
 
3.9%
Other values (10) 107011
19.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 523138
94.7%
Space Separator 19119
 
3.5%
Other Punctuation 10377
 
1.9%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
l 89622
17.1%
e 66344
12.7%
r 56887
10.9%
o 41000
7.8%
s 36140
6.9%
w 35943
6.9%
a 34742
 
6.6%
i 31988
 
6.1%
h 31346
 
6.0%
n 21611
 
4.1%
Other values (8) 77515
14.8%
Space Separator
ValueCountFrequency (%)
19119
100.0%
Other Punctuation
ValueCountFrequency (%)
/ 10377
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 523138
94.7%
Common 29496
 
5.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
l 89622
17.1%
e 66344
12.7%
r 56887
10.9%
o 41000
7.8%
s 36140
6.9%
w 35943
6.9%
a 34742
 
6.6%
i 31988
 
6.1%
h 31346
 
6.0%
n 21611
 
4.1%
Other values (8) 77515
14.8%
Common
ValueCountFrequency (%)
19119
64.8%
/ 10377
35.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 552634
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
l 89622
16.2%
e 66344
12.0%
r 56887
10.3%
o 41000
 
7.4%
s 36140
 
6.5%
w 35943
 
6.5%
a 34742
 
6.3%
i 31988
 
5.8%
h 31346
 
5.7%
n 21611
 
3.9%
Other values (10) 107011
19.4%

source_class
Categorical

Distinct3
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size464.2 KiB
groundwater
45794 
surface
13328 
unknown
 
278

Length

Max length11
Median length11
Mean length10.083771
Min length7

Characters and Unicode

Total characters598976
Distinct characters14
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowgroundwater
2nd rowsurface
3rd rowsurface
4th rowgroundwater
5th rowsurface

Common Values

Value        Count  Frequency (%)
groundwater  45794  77.1%
surface      13328  22.4%
unknown        278   0.5%

Length

Histogram of lengths of the category

Common Values (Plot)

[Plot of common values]
ValueCountFrequency (%)
groundwater 45794
77.1%
surface 13328
 
22.4%
unknown 278
 
0.5%

Most occurring characters

ValueCountFrequency (%)
r 104916
17.5%
u 59400
9.9%
a 59122
9.9%
e 59122
9.9%
n 46628
7.8%
o 46072
7.7%
w 46072
7.7%
g 45794
7.6%
d 45794
7.6%
t 45794
7.6%
Other values (4) 40262
 
6.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 598976
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 104916
17.5%
u 59400
9.9%
a 59122
9.9%
e 59122
9.9%
n 46628
7.8%
o 46072
7.7%
w 46072
7.7%
g 45794
7.6%
d 45794
7.6%
t 45794
7.6%
Other values (4) 40262
 
6.7%

Most occurring scripts

ValueCountFrequency (%)
Latin 598976
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 104916
17.5%
u 59400
9.9%
a 59122
9.9%
e 59122
9.9%
n 46628
7.8%
o 46072
7.7%
w 46072
7.7%
g 45794
7.6%
d 45794
7.6%
t 45794
7.6%
Other values (4) 40262
 
6.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 598976
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
r 104916
17.5%
u 59400
9.9%
a 59122
9.9%
e 59122
9.9%
n 46628
7.8%
o 46072
7.7%
w 46072
7.7%
g 45794
7.6%
d 45794
7.6%
t 45794
7.6%
Other values (4) 40262
 
6.7%

waterpoint_type
Categorical

Distinct7
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size464.2 KiB
communal standpipe
28522 
hand pump
17488 
other
6380 
communal standpipe multiple
6103 
improved spring
 
784
Other values (2)
 
123

Length

Max length27
Median length18
Mean length14.827576
Min length3

Characters and Unicode

Total characters880758
Distinct characters18
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowcommunal standpipe
2nd rowcommunal standpipe
3rd rowcommunal standpipe multiple
4th rowcommunal standpipe multiple
5th rowcommunal standpipe

Common Values

Value                        Count  Frequency (%)
communal standpipe           28522  48.0%
hand pump                    17488  29.4%
other                         6380  10.7%
communal standpipe multiple   6103  10.3%
improved spring                784   1.3%
cattle trough                  116   0.2%
dam                              7  < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

[Plot of common values]
ValueCountFrequency (%)
communal 34625
29.2%
standpipe 34625
29.2%
hand 17488
14.8%
pump 17488
14.8%
other 6380
 
5.4%
multiple 6103
 
5.1%
improved 784
 
0.7%
spring 784
 
0.7%
cattle 116
 
0.1%
trough 116
 
0.1%

Most occurring characters

ValueCountFrequency (%)
p 111897
12.7%
m 93632
10.6%
n 87522
9.9%
a 86861
9.9%
59116
 
6.7%
u 58332
 
6.6%
d 52904
 
6.0%
e 48008
 
5.5%
t 47456
 
5.4%
l 46947
 
5.3%
Other values (8) 188083
21.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 821642
93.3%
Space Separator 59116
 
6.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
p 111897
13.6%
m 93632
11.4%
n 87522
10.7%
a 86861
10.6%
u 58332
7.1%
d 52904
 
6.4%
e 48008
 
5.8%
t 47456
 
5.8%
l 46947
 
5.7%
i 42296
 
5.1%
Other values (7) 145787
17.7%
Space Separator
ValueCountFrequency (%)
59116
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 821642
93.3%
Common 59116
 
6.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
p 111897
13.6%
m 93632
11.4%
n 87522
10.7%
a 86861
10.6%
u 58332
7.1%
d 52904
 
6.4%
e 48008
 
5.8%
t 47456
 
5.8%
l 46947
 
5.7%
i 42296
 
5.1%
Other values (7) 145787
17.7%
Common
ValueCountFrequency (%)
59116
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 880758
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
p 111897
12.7%
m 93632
10.6%
n 87522
9.9%
a 86861
9.9%
59116
 
6.7%
u 58332
 
6.6%
d 52904
 
6.0%
e 48008
 
5.5%
t 47456
 
5.4%
l 46947
 
5.3%
Other values (8) 188083
21.4%

waterpoint_type_group
Categorical

Distinct6
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size464.2 KiB
communal standpipe
34625 
hand pump
17488 
other
6380 
improved spring
 
784
cattle trough
 
116

Length

Max length18
Median length18
Mean length13.902879
Min length3

Characters and Unicode

Total characters825831
Distinct characters18
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowcommunal standpipe
2nd rowcommunal standpipe
3rd rowcommunal standpipe
4th rowcommunal standpipe
5th rowcommunal standpipe

Common Values

Value               Count  Frequency (%)
communal standpipe  34625  58.3%
hand pump           17488  29.4%
other                6380  10.7%
improved spring       784   1.3%
cattle trough         116   0.2%
dam                     7  < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

[Plot of common values]
ValueCountFrequency (%)
communal 34625
30.8%
standpipe 34625
30.8%
hand 17488
15.6%
pump 17488
15.6%
other 6380
 
5.7%
improved 784
 
0.7%
spring 784
 
0.7%
cattle 116
 
0.1%
trough 116
 
0.1%
dam 7
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
p 105794
12.8%
m 87529
10.6%
n 87522
10.6%
a 86861
10.5%
53013
 
6.4%
d 52904
 
6.4%
u 52229
 
6.3%
e 41905
 
5.1%
o 41905
 
5.1%
t 41353
 
5.0%
Other values (8) 174816
21.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 772818
93.6%
Space Separator 53013
 
6.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
p 105794
13.7%
m 87529
11.3%
n 87522
11.3%
a 86861
11.2%
d 52904
 
6.8%
u 52229
 
6.8%
e 41905
 
5.4%
o 41905
 
5.4%
t 41353
 
5.4%
i 36193
 
4.7%
Other values (7) 138623
17.9%
Space Separator
ValueCountFrequency (%)
53013
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 772818
93.6%
Common 53013
 
6.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
p 105794
13.7%
m 87529
11.3%
n 87522
11.3%
a 86861
11.2%
d 52904
 
6.8%
u 52229
 
6.8%
e 41905
 
5.4%
o 41905
 
5.4%
t 41353
 
5.4%
i 36193
 
4.7%
Other values (7) 138623
17.9%
Common
ValueCountFrequency (%)
53013
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 825831
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
p 105794
12.8%
m 87529
10.6%
n 87522
10.6%
a 86861
10.5%
53013
 
6.4%
d 52904
 
6.4%
u 52229
 
6.3%
e 41905
 
5.1%
o 41905
 
5.1%
t 41353
 
5.0%
Other values (8) 174816
21.2%

status_group
Categorical

Distinct3
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size464.2 KiB
functional
32259 
non functional
22824 
functional needs repair
4317 

Length

Max length23
Median length10
Mean length12.481768
Min length10
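The length statistics above follow directly from the category strings and their row counts. As a rough cross-check, the sketch below recomputes the total character count, the mean length, and the minimum and maximum lengths for `status_group` from the counts in this variable's Common Values table; `length_summary` is a hypothetical helper, not part of ydata-profiling.

```python
# Row counts for `status_group`, copied from this report's Common Values table.
STATUS_COUNTS = {
    "functional": 32259,
    "non functional": 22824,
    "functional needs repair": 4317,
}


def length_summary(value_counts):
    """Derive string-length statistics from a value -> row-count mapping."""
    n_rows = sum(value_counts.values())
    # Each value contributes its length once per row that holds it.
    total_chars = sum(len(value) * n for value, n in value_counts.items())
    lengths = [len(value) for value in value_counts]
    return {
        "total_characters": total_chars,
        "mean_length": total_chars / n_rows,
        "min_length": min(lengths),
        "max_length": max(lengths),
    }


summary = length_summary(STATUS_COUNTS)
print(summary)
```

The mean is simply total characters divided by the 59400 observations, so it agrees with the "Total characters" figure reported in the Characters and Unicode block.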

Characters and Unicode

Total characters741417
Distinct characters15
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowfunctional
2nd rowfunctional
3rd rowfunctional
4th rownon functional
5th rowfunctional

Common Values

Value                    Count  Frequency (%)
functional               32259  54.3%
non functional           22824  38.4%
functional needs repair   4317   7.3%

Length

Histogram of lengths of the category

Common Values (Plot)

[Plot of common values]
ValueCountFrequency (%)
functional 59400
65.4%
non 22824
 
25.1%
needs 4317
 
4.8%
repair 4317
 
4.8%

Most occurring characters

ValueCountFrequency (%)
n 168765
22.8%
o 82224
11.1%
i 63717
 
8.6%
a 63717
 
8.6%
f 59400
 
8.0%
u 59400
 
8.0%
c 59400
 
8.0%
t 59400
 
8.0%
l 59400
 
8.0%
31458
 
4.2%
Other values (5) 34536
 
4.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 709959
95.8%
Space Separator 31458
 
4.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 168765
23.8%
o 82224
11.6%
i 63717
 
9.0%
a 63717
 
9.0%
f 59400
 
8.4%
u 59400
 
8.4%
c 59400
 
8.4%
t 59400
 
8.4%
l 59400
 
8.4%
e 12951
 
1.8%
Other values (4) 21585
 
3.0%
Space Separator
ValueCountFrequency (%)
31458
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 709959
95.8%
Common 31458
 
4.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 168765
23.8%
o 82224
11.6%
i 63717
 
9.0%
a 63717
 
9.0%
f 59400
 
8.4%
u 59400
 
8.4%
c 59400
 
8.4%
t 59400
 
8.4%
l 59400
 
8.4%
e 12951
 
1.8%
Other values (4) 21585
 
3.0%
Common
ValueCountFrequency (%)
31458
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 741417
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 168765
22.8%
o 82224
11.1%
i 63717
 
8.6%
a 63717
 
8.6%
f 59400
 
8.0%
u 59400
 
8.0%
c 59400
 
8.0%
t 59400
 
8.0%
l 59400
 
8.0%
31458
 
4.2%
Other values (5) 34536
 
4.7%

Interactions

2024-02-05T10:12:38.090991image/svg+xmlMatplotlib v3.8.2, https://matplotlib.org/
2024-02-05T10:12:40.080789image/svg+xmlMatplotlib v3.8.2, https://matplotlib.org/
2024-02-05T10:12:41.787872image/svg+xmlMatplotlib v3.8.2, https://matplotlib.org/
2024-02-05T10:12:43.302752image/svg+xmlMatplotlib v3.8.2, https://matplotlib.org/
2024-02-05T10:12:29.754415image/svg+xmlMatplotlib v3.8.2, https://matplotlib.org/
2024-02-05T10:12:31.022858image/svg+xmlMatplotlib v3.8.2, https://matplotlib.org/
2024-02-05T10:12:32.702508image/svg+xmlMatplotlib v3.8.2, https://matplotlib.org/
2024-02-05T10:12:34.097402image/svg+xmlMatplotlib v3.8.2, https://matplotlib.org/
2024-02-05T10:12:35.527721image/svg+xmlMatplotlib v3.8.2, https://matplotlib.org/
2024-02-05T10:12:36.847152image/svg+xmlMatplotlib v3.8.2, https://matplotlib.org/
2024-02-05T10:12:38.221189image/svg+xmlMatplotlib v3.8.2, https://matplotlib.org/
2024-02-05T10:12:40.249270image/svg+xmlMatplotlib v3.8.2, https://matplotlib.org/
2024-02-05T10:12:41.914678image/svg+xmlMatplotlib v3.8.2, https://matplotlib.org/
2024-02-05T10:12:43.428740image/svg+xmlMatplotlib v3.8.2, https://matplotlib.org/
2024-02-05T10:12:29.870657image/svg+xmlMatplotlib v3.8.2, https://matplotlib.org/
2024-02-05T10:12:31.145180image/svg+xmlMatplotlib v3.8.2, https://matplotlib.org/
2024-02-05T10:12:32.843297image/svg+xmlMatplotlib v3.8.2, https://matplotlib.org/
2024-02-05T10:12:34.338542image/svg+xmlMatplotlib v3.8.2, https://matplotlib.org/
2024-02-05T10:12:35.667973image/svg+xmlMatplotlib v3.8.2, https://matplotlib.org/
2024-02-05T10:12:37.035723image/svg+xmlMatplotlib v3.8.2, https://matplotlib.org/
2024-02-05T10:12:38.336648image/svg+xmlMatplotlib v3.8.2, https://matplotlib.org/
2024-02-05T10:12:40.430300image/svg+xmlMatplotlib v3.8.2, https://matplotlib.org/
2024-02-05T10:12:42.033320image/svg+xmlMatplotlib v3.8.2, https://matplotlib.org/
2024-02-05T10:12:43.575424image/svg+xmlMatplotlib v3.8.2, https://matplotlib.org/
2024-02-05T10:12:29.999523image/svg+xmlMatplotlib v3.8.2, https://matplotlib.org/
2024-02-05T10:12:31.291894image/svg+xmlMatplotlib v3.8.2, https://matplotlib.org/
2024-02-05T10:12:32.996878image/svg+xmlMatplotlib v3.8.2, https://matplotlib.org/
2024-02-05T10:12:34.471234image/svg+xmlMatplotlib v3.8.2, https://matplotlib.org/
2024-02-05T10:12:35.800196image/svg+xmlMatplotlib v3.8.2, https://matplotlib.org/
2024-02-05T10:12:37.172824image/svg+xmlMatplotlib v3.8.2, https://matplotlib.org/
2024-02-05T10:12:38.470227image/svg+xmlMatplotlib v3.8.2, https://matplotlib.org/
2024-02-05T10:12:40.577108image/svg+xmlMatplotlib v3.8.2, https://matplotlib.org/
2024-02-05T10:12:42.170701image/svg+xmlMatplotlib v3.8.2, https://matplotlib.org/
2024-02-05T10:12:43.713039image/svg+xmlMatplotlib v3.8.2, https://matplotlib.org/
2024-02-05T10:12:30.128720image/svg+xmlMatplotlib v3.8.2, https://matplotlib.org/
2024-02-05T10:12:31.431319image/svg+xmlMatplotlib v3.8.2, https://matplotlib.org/
2024-02-05T10:12:33.132655image/svg+xmlMatplotlib v3.8.2, https://matplotlib.org/
2024-02-05T10:12:34.600894image/svg+xmlMatplotlib v3.8.2, https://matplotlib.org/
2024-02-05T10:12:35.917230image/svg+xmlMatplotlib v3.8.2, https://matplotlib.org/
2024-02-05T10:12:37.295057image/svg+xmlMatplotlib v3.8.2, https://matplotlib.org/
2024-02-05T10:12:38.625757image/svg+xmlMatplotlib v3.8.2, https://matplotlib.org/
2024-02-05T10:12:40.710607image/svg+xmlMatplotlib v3.8.2, https://matplotlib.org/
2024-02-05T10:12:42.292264image/svg+xmlMatplotlib v3.8.2, https://matplotlib.org/

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
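
The two displays described above can be approximated in a few lines. The sketch below is a minimal illustration, not the report's own plotting code: the helper names `is_missing`, `nullity_by_column`, and `nullity_matrix` are hypothetical, and the three records are a small subset of columns taken from the sample rows (indices 0, 1, and 4), with `None` standing in for NaN.

```python
import math


def is_missing(value):
    """Treat None and float NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def nullity_by_column(rows):
    """Count missing cells per column across a list of dict records."""
    counts = {}
    for row in rows:
        for column, value in row.items():
            counts[column] = counts.get(column, 0) + (1 if is_missing(value) else 0)
    return counts


def nullity_matrix(rows, columns):
    """Return one string per record: '#' for a present value, '.' for a missing one."""
    return [
        "".join("." if is_missing(row.get(column)) else "#" for column in columns)
        for row in rows
    ]


# Hypothetical mini-dataset: three columns from sample rows 0, 1 and 4.
rows = [
    {"funder": "Roman", "scheme_management": "VWC", "scheme_name": "Roman"},
    {"funder": "Grumeti", "scheme_management": "Other", "scheme_name": None},
    {"funder": "Action In A", "scheme_management": None, "scheme_name": None},
]
columns = ["funder", "scheme_management", "scheme_name"]

counts = nullity_by_column(rows)        # per-column missing counts
matrix = nullity_matrix(rows, columns)  # row-by-row presence pattern

for line in matrix:
    print(line)
print(counts)
```

Reading the matrix top to bottom shows which records lack which fields, while the per-column counts correspond to the bar-style view of nullity.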

Sample

First rows

index | id | amount_tsh | date_recorded | funder | gps_height | installer | longitude | latitude | wpt_name | num_private | basin | subvillage | region | region_code | district_code | lga | ward | population | public_meeting | recorded_by | scheme_management | scheme_name | permit | construction_year | extraction_type | extraction_type_group | extraction_type_class | management | management_group | payment | payment_type | water_quality | quality_group | quantity | quantity_group | source | source_type | source_class | waterpoint_type | waterpoint_type_group | status_group
0 | 69572 | 6000.0 | 2011-03-14 | Roman | 1390 | Roman | 34.938093 | -9.856322 | none | 0 | Lake Nyasa | Mnyusi B | Iringa | 11 | 5 | Ludewa | Mundindi | 109 | True | GeoData Consultants Ltd | VWC | Roman | False | 1999 | gravity | gravity | gravity | vwc | user-group | pay annually | annually | soft | good | enough | enough | spring | spring | groundwater | communal standpipe | communal standpipe | functional
1 | 8776 | 0.0 | 2013-03-06 | Grumeti | 1399 | GRUMETI | 34.698766 | -2.147466 | Zahanati | 0 | Lake Victoria | Nyamara | Mara | 20 | 2 | Serengeti | Natta | 280 | NaN | GeoData Consultants Ltd | Other | NaN | True | 2010 | gravity | gravity | gravity | wug | user-group | never pay | never pay | soft | good | insufficient | insufficient | rainwater harvesting | rainwater harvesting | surface | communal standpipe | communal standpipe | functional
2 | 34310 | 25.0 | 2013-02-25 | Lottery Club | 686 | World vision | 37.460664 | -3.821329 | Kwa Mahundi | 0 | Pangani | Majengo | Manyara | 21 | 4 | Simanjiro | Ngorika | 250 | True | GeoData Consultants Ltd | VWC | Nyumba ya mungu pipe scheme | True | 2009 | gravity | gravity | gravity | vwc | user-group | pay per bucket | per bucket | soft | good | enough | enough | dam | dam | surface | communal standpipe multiple | communal standpipe | functional
3 | 67743 | 0.0 | 2013-01-28 | Unicef | 263 | UNICEF | 38.486161 | -11.155298 | Zahanati Ya Nanyumbu | 0 | Ruvuma / Southern Coast | Mahakamani | Mtwara | 90 | 63 | Nanyumbu | Nanyumbu | 58 | True | GeoData Consultants Ltd | VWC | NaN | True | 1986 | submersible | submersible | submersible | vwc | user-group | never pay | never pay | soft | good | dry | dry | machine dbh | borehole | groundwater | communal standpipe multiple | communal standpipe | non functional
4 | 19728 | 0.0 | 2011-07-13 | Action In A | 0 | Artisan | 31.130847 | -1.825359 | Shuleni | 0 | Lake Victoria | Kyanyamisa | Kagera | 18 | 1 | Karagwe | Nyakasimbi | 0 | True | GeoData Consultants Ltd | NaN | NaN | True | 0 | gravity | gravity | gravity | other | other | never pay | never pay | soft | good | seasonal | seasonal | rainwater harvesting | rainwater harvesting | surface | communal standpipe | communal standpipe | functional
5 | 9944 | 20.0 | 2011-03-13 | Mkinga Distric Coun | 0 | DWE | 39.172796 | -4.765587 | Tajiri | 0 | Pangani | Moa/Mwereme | Tanga | 4 | 8 | Mkinga | Moa | 1 | True | GeoData Consultants Ltd | VWC | Zingibali | True | 2009 | submersible | submersible | submersible | vwc | user-group | pay per bucket | per bucket | salty | salty | enough | enough | other | other | unknown | communal standpipe multiple | communal standpipe | functional
6 | 19816 | 0.0 | 2012-10-01 | Dwsp | 0 | DWSP | 33.362410 | -3.766365 | Kwa Ngomho | 0 | Internal | Ishinabulandi | Shinyanga | 17 | 3 | Shinyanga Rural | Samuye | 0 | True | GeoData Consultants Ltd | VWC | NaN | True | 0 | swn 80 | swn 80 | handpump | vwc | user-group | never pay | never pay | soft | good | enough | enough | machine dbh | borehole | groundwater | hand pump | hand pump | non functional
7 | 54551 | 0.0 | 2012-10-09 | Rwssp | 0 | DWE | 32.620617 | -4.226198 | Tushirikiane | 0 | Lake Tanganyika | Nyawishi Center | Shinyanga | 17 | 3 | Kahama | Chambo | 0 | True | GeoData Consultants Ltd | NaN | NaN | True | 0 | nira/tanira | nira/tanira | handpump | wug | user-group | unknown | unknown | milky | milky | enough | enough | shallow well | shallow well | groundwater | hand pump | hand pump | non functional
8 | 53934 | 0.0 | 2012-11-03 | Wateraid | 0 | Water Aid | 32.711100 | -5.146712 | Kwa Ramadhan Musa | 0 | Lake Tanganyika | Imalauduki | Tabora | 14 | 6 | Tabora Urban | Itetemia | 0 | True | GeoData Consultants Ltd | VWC | NaN | True | 0 | india mark ii | india mark ii | handpump | vwc | user-group | never pay | never pay | salty | salty | seasonal | seasonal | machine dbh | borehole | groundwater | hand pump | hand pump | non functional
9 | 46144 | 0.0 | 2011-08-03 | Isingiro Ho | 0 | Artisan | 30.626991 | -1.257051 | Kwapeto | 0 | Lake Victoria | Mkonomre | Kagera | 18 | 1 | Karagwe | Kaisho | 0 | True | GeoData Consultants Ltd | NaN | NaN | True | 0 | nira/tanira | nira/tanira | handpump | vwc | user-group | never pay | never pay | soft | good | enough | enough | shallow well | shallow well | groundwater | hand pump | hand pump | functional

Last rows

index | id | amount_tsh | date_recorded | funder | gps_height | installer | longitude | latitude | wpt_name | num_private | basin | subvillage | region | region_code | district_code | lga | ward | population | public_meeting | recorded_by | scheme_management | scheme_name | permit | construction_year | extraction_type | extraction_type_group | extraction_type_class | management | management_group | payment | payment_type | water_quality | quality_group | quantity | quantity_group | source | source_type | source_class | waterpoint_type | waterpoint_type_group | status_group
59390 | 13677 | 0.0 | 2011-08-04 | Rudep | 1715 | DWE | 31.370848 | -8.258160 | Kwa Mzee Atanas | 0 | Lake Tanganyika | Kitonto | Rukwa | 15 | 2 | Sumbawanga Rural | Mkowe | 150 | True | GeoData Consultants Ltd | VWC | NaN | False | 1991 | swn 80 | swn 80 | handpump | vwc | user-group | never pay | never pay | soft | good | insufficient | insufficient | machine dbh | borehole | groundwater | hand pump | hand pump | functional
59391 | 44885 | 0.0 | 2013-08-03 | Government Of Tanzania | 540 | Government | 38.044070 | -4.272218 | Kwa | 0 | Pangani | Maore Kati | Kilimanjaro | 3 | 3 | Same | Maore | 210 | True | GeoData Consultants Ltd | Water authority | Hingilili | True | 1967 | gravity | gravity | gravity | vwc | user-group | never pay | never pay | soft | good | enough | enough | river | river/lake | surface | communal standpipe | communal standpipe | non functional
59392 | 40607 | 0.0 | 2011-04-15 | Government Of Tanzania | 0 | Government | 33.009440 | -8.520888 | Benard Charles | 0 | Lake Rukwa | Mbuyuni A | Mbeya | 12 | 1 | Chunya | Mbuyuni | 0 | True | GeoData Consultants Ltd | VWC | NaN | True | 0 | gravity | gravity | gravity | vwc | user-group | never pay | never pay | soft | good | enough | enough | spring | spring | groundwater | communal standpipe | communal standpipe | non functional
59393 | 48348 | 0.0 | 2012-10-27 | Private | 0 | Private | 33.866852 | -4.287410 | Kwa Peter | 0 | Internal | Masanga | Tabora | 14 | 2 | Igunga | Igunga | 0 | False | GeoData Consultants Ltd | Water authority | NaN | False | 0 | gravity | gravity | gravity | private operator | commercial | pay per bucket | per bucket | soft | good | insufficient | insufficient | dam | dam | surface | other | other | functional
59394 | 11164 | 500.0 | 2011-03-09 | World Bank | 351 | ML appro | 37.634053 | -6.124830 | Chimeredya | 0 | Wami / Ruvu | Komstari | Morogoro | 5 | 6 | Mvomero | Diongoya | 89 | True | GeoData Consultants Ltd | VWC | NaN | True | 2007 | submersible | submersible | submersible | vwc | user-group | pay monthly | monthly | soft | good | enough | enough | machine dbh | borehole | groundwater | communal standpipe | communal standpipe | non functional
59395 | 60739 | 10.0 | 2013-05-03 | Germany Republi | 1210 | CES | 37.169807 | -3.253847 | Area Three Namba 27 | 0 | Pangani | Kiduruni | Kilimanjaro | 3 | 5 | Hai | Masama Magharibi | 125 | True | GeoData Consultants Ltd | Water Board | Losaa Kia water supply | True | 1999 | gravity | gravity | gravity | water board | user-group | pay per bucket | per bucket | soft | good | enough | enough | spring | spring | groundwater | communal standpipe | communal standpipe | functional
59396 | 27263 | 4700.0 | 2011-05-07 | Cefa-njombe | 1212 | Cefa | 35.249991 | -9.070629 | Kwa Yahona Kuvala | 0 | Rufiji | Igumbilo | Iringa | 11 | 4 | Njombe | Ikondo | 56 | True | GeoData Consultants Ltd | VWC | Ikondo electrical water sch | True | 1996 | gravity | gravity | gravity | vwc | user-group | pay annually | annually | soft | good | enough | enough | river | river/lake | surface | communal standpipe | communal standpipe | functional
59397 | 37057 | 0.0 | 2011-04-11 | NaN | 0 | NaN | 34.017087 | -8.750434 | Mashine | 0 | Rufiji | Madungulu | Mbeya | 12 | 7 | Mbarali | Chimala | 0 | True | GeoData Consultants Ltd | VWC | NaN | False | 0 | swn 80 | swn 80 | handpump | vwc | user-group | pay monthly | monthly | fluoride | fluoride | enough | enough | machine dbh | borehole | groundwater | hand pump | hand pump | functional
59398 | 31282 | 0.0 | 2011-03-08 | Malec | 0 | Musa | 35.861315 | -6.378573 | Mshoro | 0 | Rufiji | Mwinyi | Dodoma | 1 | 4 | Chamwino | Mvumi Makulu | 0 | True | GeoData Consultants Ltd | VWC | NaN | True | 0 | nira/tanira | nira/tanira | handpump | vwc | user-group | never pay | never pay | soft | good | insufficient | insufficient | shallow well | shallow well | groundwater | hand pump | hand pump | functional
59399 | 26348 | 0.0 | 2011-03-23 | World Bank | 191 | World | 38.104048 | -6.747464 | Kwa Mzee Lugawa | 0 | Wami / Ruvu | Kikatanyemba | Morogoro | 5 | 2 | Morogoro Rural | Ngerengere | 150 | True | GeoData Consultants Ltd | VWC | NaN | True | 2002 | nira/tanira | nira/tanira | handpump | vwc | user-group | pay when scheme fails | on failure | salty | salty | enough | enough | shallow well | shallow well | groundwater | hand pump | hand pump | functional